Abstract

We study frequency linear-time temporal logic (fLTL) which extends the linear-time temporal
logic (LTL) with a path operator $G^{p}$ expressing that on a path, a certain formula holds with at
least a given frequency p, thus relaxing the semantics of the usual G operator of LTL. Such a logic
is particularly useful in probabilistic systems, where some undesirable events such as random 
failures may occur and are acceptable if they are rare enough. Frequency-related extensions of 
LTL have been previously studied by several authors, where mostly the logic is equipped with an 
extended “until” and “globally” operator, leading to undecidability of most interesting problems. 

For the variant we study, we are able to establish fundamental decidability results. We show 
that for Markov chains, the problem of computing the probability with which a given fLTL 
formula holds has the same complexity as the analogous problem for LTL. We also show that for 
Markov decision processes the problem becomes more delicate, but when restricting the frequency 
bound p to be 1 and negations not to be outside any $G^{p}$ operator, we can compute the maximum
probability of satisfying the fLTL formula. This can be again performed with the same time 
complexity as for the ordinary LTL formulas.

1 Introduction

Probabilistic verification is a vibrant area of research that aims to formally check properties 
of stochastic systems. Among the most prominent formalisms, with applications in e.g. 
modelling of network security protocols [19] or randomised algorithms [?], are Markov
chains and Markov decision processes (MDPs). Markov chains are apt for modelling systems 
that contain purely stochastic behaviour, for example random failures, while MDPs can also 
express nondeterminism, most commonly present as decisions of a controller or dually as 
adversarial events in the system. 

More technically, an MDP is a process that moves in discrete steps within a finite state
space (labelled by sets of atomic propositions). Its evolution starts in a given initial state
$s_0$. In each step, a controller chooses an action $a_i$ from a finite set $A(s_i)$ of actions available
in the current state $s_i$. The next state $s_{i+1}$ is then chosen randomly according to a fixed
probability distribution $\Delta(s_i, a_i)$. The controller may base its choice on the previous
evolution $s_0 a_0 \ldots a_{i-1} s_i$ and may also choose the action randomly. A Markov chain is an MDP
where the set $A(s)$ is a singleton for each state s.

For systems modelled as Markov chains or MDPs, desired properties such as
“whenever a signal arrives to the system, the system eventually switches off” can often be
captured by a suitable linear-time logic. The most prominent one in the verification
community is Linear Temporal Logic (LTL). Although LTL is suitable in many scenarios, it
cannot capture some important linear-time properties, for example that a given event
takes place sufficiently often. The need for such properties becomes even more apparent in
stochastic systems, in which probabilities often model random failures. Instead of requiring
that no failure ever happens, it is natural to require that failures are infrequent, while still
having the power of LTL to specify these failures using a complex LTL formula.

A natural solution to the above problem is to extend LTL with operators that allow us
to talk about frequencies of events. Adding such operators can easily lead to undecidability
as they often allow one to encode values of potentially infinite counters [?]. In both the
above papers this is caused by a variant of a “frequency until” operator that talks about
the ratio of the number of given events happening along a finite path. The undecidability
results from [?] carry over to the stochastic setting easily, and so, to avoid undecidability,
care needs to be taken.

In this paper, we take an approach similar to [?] and in addition to the usual operators of
LTL such as X, U, G or F we only allow frequency globally formulae $G^{p}\varphi$ that require
the formula $\varphi$ to hold on a p-fraction of suffixes of an infinite path, or more formally, $G^{p}\varphi$ is
true on an infinite path $s_0 a_0 s_1 a_1 \ldots$ of an MDP if and only if

$$\liminf_{n\to\infty} \frac{1}{n} \cdot \bigl|\{\, i \mid i < n \text{ and } s_i a_i s_{i+1} a_{i+1} \ldots \text{ satisfies } \varphi \,\}\bigr| \;\ge\; p.$$

This logic, which we call frequency LTL (fLTL), is still a significant extension of LTL,
and because all operators can be nested, it can express a much larger class of properties
(a careful reader will notice that nesting of frequency operators is not the main challenge
when dealing with fLTL as it can be easily removed at the price of an exponential blow-up of
the size of the formula).

The problem studied in this paper asks, given a Markov chain and an fLTL formula, 
to compute the probability with which the formula is satisfied in the Markov chain when 
starting in the initial state. Analogously, for MDPs we study the controller synthesis problem 
which asks to compute the maximal probability of satisfying the formula, over all controllers. 

For an example of a possible application, suppose a network service accepts queries by
immediately sending back responses, and in addition it needs to be switched off for
maintenance during which the queries are not accepted. In most states, a new query comes
in the next step with probability 0.5. In the waiting state, the system chooses either to wait
further (action w), or to start a maintenance (action m) which takes one step to finish. The
service is modelled as an MDP from Figure 1,
leaving some parts of the behaviour unspecified. The aim is to synthesise a control strategy
that meets the requirements on the system with a given probability. Example requirements
can be given by a formula $G F m \wedge G F (q \to X r)$ which will require that the service
sometimes accepts the request, and sometimes goes for maintenance. However, there is no
quantitative restriction on how often the maintenance can take place, and such a restriction is
inexpressible in LTL. In fLTL, by contrast, we can use the formula $G F m \wedge G^{0.95}(q \to X r)$ to
require that the service is running sufficiently often, or a strong restriction $G F m \wedge G^{1}(q \to X r)$
saying that it is running with frequency 1. The formula may also contain several
frequency operators. In order to push the frequency of correctly handled queries towards a
bound p, the controller needs to choose to perform the maintenance less and less frequently
during operation.



Figure 1 An example MDP.


Related work Controller synthesis for ordinary LTL is a well-studied problem solvable in
time polynomial in the size of the model and doubly exponential in the size of the
formula [2]. Usually, the LTL formula is transformed to an equivalent Rabin automaton, and
the probability of reaching certain subgraphs is computed in a product of the MDP (or
Markov chain) with the automaton.

A similar approach is taken by [21]. They study a logic similar to our fLTL, where
LTL is extended with mean-payoff reward constraints in which the reward structures are
determined by validity of given subformulas. The authors show that any formula can be
converted to a variant of non-deterministic Büchi automata, called multi-threshold
mean-payoff Büchi automata, with decidable emptiness problem, thus yielding decidability for
model-checking and satisfiability problems of labelled transition systems. Results of [21]
cannot be applied to probabilistic systems: here one needs to work with deterministic
automata, but as pointed out in [21, Section 4, Footnote 4] the approach of [21] heavily relies
on non-determinism, since reward values depend on the complete future, and so deterministic
“multi-threshold mean-payoff Rabin automata” are strictly less expressive than the logic.
Another variant of frequency LTL was studied in [?, 7], in which also a modified until
operator is introduced. The work [?] maintains boolean semantics of the logic, while in [7] the
value of a formula is a number between 0 and 1. Both works obtain undecidability results for
their logics, and [?] also yields decidability for restricted nesting. Another logic that speaks
about frequencies on a finite interval was introduced in [?] but provides an analysis algorithm
only for a bounded fragment.

Significant attention has been given to the study of quantitative objectives. The work
[?] adds mean-payoff objectives to temporal logics, but only as atomic propositions and
not allowing more complex properties to be quantified. The work [3] extends LTL with
another form of quantitative operators, allowing accumulated weight constraints expressed
using automata, again not allowing quantification over complex formulas. [1] introduces
lexicographically ordered mean-payoff objectives in non-stochastic parity games and [5] gives
a polynomial time algorithm for almost-sure winning in MDPs with mean-payoff and
parity objectives. These objectives do not allow attaching mean-payoff (i.e. frequencies) to
properties more complex than atomic propositions. The solution to the problem requires
an infinite-memory strategy which at a high level has a form similar to the form of strategies we
construct for MDPs. Similar strategies also occur in [?] although each of these works
deals with a fundamentally different problem.

In branching-time logics, CSL is sometimes equipped with a “steady-state” operator 
whose semantics is similar to ours (see e.g. [1]), and an analogous approach has been
taken for the logic PCTL [13, ?]. In such logics every temporal subformula is evaluated
over states, and thus the model-checking of a frequency operator can be directly reduced 
to achieving a single mean-payoff reward. This is contrasted with our setting in which the 
whole formula is evaluated over a single path, giving rise to much more complex behaviour. 


Our contributions To the best of our knowledge, this paper gives the first decidability results for
probabilistic verification against linear-time temporal logics extended by frequency operators 
with complex nested subformulas of the logic. 

We first give an algorithm for computing the probability of satisfying an fLTL formula 
in a Markov Chain. The algorithm works by breaking the fLTL formula into linearly many 
ordinary LTL formulas, and then off-the-shelf verification algorithms can be applied. We 
obtain that the complexity of fLTL model-checking is the same as the complexity of LTL 
model checking. Although the algorithm itself is very simple, some care needs to be taken 
when proving its correctness: as we explain later, the “obvious” proof approach would fail 
since some common assumptions on independence of events are not satisfied. 

We then proceed with Markov decision processes, where we show that the controller
synthesis problem is significantly more complex. Unlike for ordinary LTL, for fLTL the
controller-synthesis problem may require strategies to use infinite memory, even for very
simple formulas. On the positive side, we give an algorithm for synthesis of strategies for 
formulas in which the negations are pushed to atomic propositions, and all the frequency 
operators have lower bound 1. Although this might appear to be a simple problem, it is not 
easily reducible to the problem for LTL, and the proof of the correctness of the algorithm 
is in fact very involved. This is partly because even if a strategy satisfies the formula, it 
can exhibit a very “insensible” behaviour, as long as this behaviour has zero frequency in 
the limit. In the proof, we need to identify these cases and eliminate them. Ultimately, our 
construction again yields the same complexity as the problem for ordinary LTL. We believe 
the contribution of the fragment is both practical, as it gives a “weaker” alternative of the 
G operator usable in controller synthesis, and theoretical, giving new insights into many of 
the challenges one will face in solving the controller-synthesis problem for the whole fLTL.

2 Preliminaries


We now proceed with introducing basic notions we use throughout this paper. 

A probability distribution over a finite or countable set X is a function $d : X \to [0,1]$
such that $\sum_{x \in X} d(x) = 1$, and $\mathcal{D}(X)$ denotes the set of all probability distributions over X.


Markov decision processes and Markov chains A Markov decision process (MDP) is a
tuple $\mathcal{M} = (S, A, \Delta)$ where S is a finite set of states, A is a finite set of actions, and
$\Delta : S \times A \to \mathcal{D}(S)$ is a partial probabilistic transition function. A Markov chain (MC) is an
MDP in which for every $s \in S$ there is exactly one a with $\Delta(s, a)$ being defined. We omit
actions completely when we speak about Markov chains and no confusion can arise.
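The tuple above can be sketched as a small data structure. This is only an illustration under stated assumptions: the dictionary encoding, the helper names `enabled`, `is_well_formed` and `is_markov_chain`, and the toy transition table are all hypothetical and are not the MDP of Figure 1.

```python
from fractions import Fraction

# Hypothetical toy MDP (not the one from Figure 1): delta[(s, a)] is the
# distribution Delta(s, a); pairs missing from the dict are undefined.
delta = {
    ("s0", "w"): {"s0": Fraction(1, 2), "s1": Fraction(1, 2)},
    ("s0", "m"): {"s2": Fraction(1)},
    ("s1", "a"): {"s0": Fraction(1)},
    ("s2", "a"): {"s0": Fraction(1)},
}

def enabled(delta, s):
    """Actions a for which Delta(s, a) is defined."""
    return sorted(a for (t, a) in delta if t == s)

def is_well_formed(delta):
    """Every defined Delta(s, a) must be a probability distribution."""
    return all(sum(d.values()) == 1 and min(d.values()) >= 0
               for d in delta.values())

def is_markov_chain(delta):
    """A Markov chain has exactly one enabled action in every state."""
    states = {s for (s, _) in delta} | {t for d in delta.values() for t in d}
    return all(len(enabled(delta, s)) == 1 for s in states)
```

Removing the entry for `("s0", "m")` leaves one action per state, so the restricted table passes `is_markov_chain`.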

An infinite path, also called run, in $\mathcal{M}$ is a sequence $\omega = s_0 a_0 s_1 a_1 \cdots$ of states and actions
such that $\Delta(s_i, a_i)(s_{i+1}) > 0$ for all i, and we denote by $\omega(i)$ the suffix $s_i a_i s_{i+1} a_{i+1} \cdots$. A
finite path h, also called history, is a prefix of an infinite path ending in a state. Given a
finite path $h = s_0 a_0 s_1 a_1 \cdots s_i$ and a finite or infinite path $h' = s_i a_i s_{i+1} a_{i+1} \cdots$ we use $h \cdot h'$
to denote the concatenated path $s_0 a_0 s_1 a_1 \cdots$. The set of paths starting with a prefix h is
denoted by Cyl(h), or simply by h if it leads to no confusion. We overload the notation also
for sets of histories: we simply use H instead of $\bigcup_{h \in H} \mathrm{Cyl}(h)$.

A strategy is a function $\sigma$ that to every finite path h assigns a probability distribution
over actions such that if an action a is assigned a non-zero probability, then $\Delta(s, a)$ is
defined where s denotes the last state in h. A strategy $\sigma$ is deterministic if it assigns a Dirac
distribution to any history, and randomised otherwise. Further, it is memoryless if its choice
only depends on the last state of the history, and finite-memory if there is a finite automaton
such that $\sigma$ only makes its choice based on the state the automaton ends in after reading
the history.

An MDP $\mathcal{M}$, a strategy $\sigma$ and an initial state $s_{in}$ give rise to a probability space $\mathbb{P}^{s_{in}}_{\sigma}$
defined in a standard way [?]. For a history h and a measurable set of runs U starting from
the last state of h, we denote by $\mathbb{P}^{h}_{\sigma}(U)$ the probability $\mathbb{P}^{s_{in}}_{\sigma}(\{h \cdot \omega \mid \omega \in U\} \mid h)$. Similarly,
for a random variable X we denote by $\mathbb{E}^{s_{in}}_{\sigma}(X)$ the expectation of X in this probability
space and by $\mathbb{E}^{h}_{\sigma}(X)$ the expectation $\mathbb{E}^{s_{in}}_{\sigma}(X_h \mid h)$. Here, $X_h$ is defined by $X_h(h \cdot \omega) = X(\omega)$
for runs of the form $h \cdot \omega$, and by $X_h(\omega') = 0$ for all other runs. We say that a property
holds almost surely (or for almost all runs, or almost every run) if the probability of the
runs satisfying the property is 1.

Frequency LTL The syntax of frequency LTL (fLTL) is defined by the equation:

$$\varphi ::= a \mid \neg\varphi \mid \varphi \vee \varphi \mid X \varphi \mid \varphi\, U\, \varphi \mid G^{p} \varphi$$

where a ranges over a set AP of atomic propositions. The logic LTL is obtained by omitting
the rule for $G^{p}\varphi$. For Markov chains we study the whole fLTL whereas for MDPs, we restrict
to a fragment that we call 1-fLTL. In this fragment, negations only occur immediately
preceding atomic propositions, and $G^{p}$ operators occur only with p = 1.

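For an infinite sequence $\gamma = x_1 x_2 \ldots$ of numbers, we set $\mathrm{freq}(\gamma) := \liminf_{i\to\infty} \frac{1}{i} \sum_{j=1}^{i} x_j$.
Given a valuation $\nu : S \to 2^{AP}$, the semantics of fLTL is defined over a path $\omega = s_0 a_0 s_1 \ldots$
of an MDP as follows.

$$\begin{array}{lcl}
\omega \models a & \text{iff} & a \in \nu(s_0) \\
\omega \models \neg\varphi & \text{iff} & \omega \not\models \varphi \\
\omega \models \varphi_1 \vee \varphi_2 & \text{iff} & \omega \models \varphi_1 \text{ or } \omega \models \varphi_2 \\
\omega \models X\varphi & \text{iff} & \omega(1) \models \varphi \\
\omega \models \varphi_1\,U\,\varphi_2 & \text{iff} & \exists k : \omega(k) \models \varphi_2 \wedge \forall \ell < k : \omega(\ell) \models \varphi_1 \\
\omega \models G^{p}\varphi & \text{iff} & \mathrm{freq}(1_{\varphi,0} 1_{\varphi,1} \ldots) \ge p
\end{array}$$

where $1_{\varphi,i}$ is 1 for $\omega$ iff $\omega(i) \models \varphi$, and 0 otherwise. We define true, false, $\wedge$, and $\to$ by their
usual definitions and introduce standard operators F and G by putting $F\varphi = \text{true}\,U\,\varphi$ and
$G\varphi = \neg F \neg\varphi$. Finally, we use $\mathbb{P}_{\sigma}(\varphi)$ as a shorthand for $\mathbb{P}_{\sigma}(\{\omega \mid \omega \models \varphi\})$.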
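To make the definition of freq concrete, the following sketch evaluates it on ultimately periodic 0/1 sequences, where the limit inferior is an ordinary limit equal to the average over the repeated part. The restriction to such "lasso" sequences and the function names are assumptions made for illustration only; the logic itself is defined over arbitrary infinite paths.

```python
from fractions import Fraction

def freq_lasso(prefix, loop):
    """freq of the 0/1 sequence prefix . loop^omega.

    The finite prefix has no influence in the limit, so the value is the
    average of the repeated part."""
    if not loop:
        raise ValueError("the repeated part must be non-empty")
    return Fraction(sum(loop), len(loop))

def prefix_average(prefix, loop, n):
    """Average of the first n entries (n >= 1) of prefix . loop^omega."""
    entries = [
        prefix[i] if i < len(prefix) else loop[(i - len(prefix)) % len(loop)]
        for i in range(n)
    ]
    return Fraction(sum(entries), n)
```

For instance, with indicator sequence `0 0 0 (1 1 0)^omega` the constraint $G^{p}\varphi$ holds exactly when p is at most 2/3, and `prefix_average` approaches that value as n grows.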

► Definition 1 (Controller synthesis). The controller synthesis problem asks to decide, given
an MDP $\mathcal{M}$, a valuation $\nu$, an initial state $s_{in}$, an fLTL formula $\varphi$ and a probability bound
x, whether $\mathbb{P}^{s_{in}}_{\sigma}(\varphi) \ge x$ for some strategy $\sigma$.

As an alternative to the above problem, we can ask to compute the maximal possible
x for which the answer is true. In the case of Markov chains, we speak about the satisfaction
problem since there is no strategy to synthesise.

Rabin automata A (deterministic) Rabin automaton is a tuple $R = (Q, q_{in}, \Sigma, \delta, \mathcal{F})$ where
Q is a finite set of states, $q_{in} \in Q$ is an initial state, $\Sigma$ is an input alphabet,
$\delta : Q \times \Sigma \to Q$ is a transition function, and
$\mathcal{F} \subseteq 2^{Q} \times 2^{Q}$ is an accepting condition. A computation of R on an infinite word $\varrho = a_0 a_1 \ldots$
over the alphabet $\Sigma$ is the infinite sequence $q_0 q_1 \ldots$ with $q_0 = q_{in}$ and $\delta(q_i, a_i) = q_{i+1}$.
A computation is accepting (or “R accepts $\varrho$”) if there is $(E, F) \in \mathcal{F}$ such that all states
of E occur only finitely many times in the computation, and some state of F occurs in it
infinitely many times. For a run $\omega = s_0 a_0 s_1 a_1 \cdots$ and a valuation $\nu$, we use $\nu(\omega)$ for the
sequence $\nu(s_0)\nu(s_1)\ldots$ of sets of atomic propositions.
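The acceptance condition can be checked mechanically on ultimately periodic words, as in the sketch below. It is an illustration under assumptions that are not from the paper: words are given as a finite prefix plus a non-empty repeated part, letters are plain strings rather than sets of atomic propositions, and the two-state automaton in the usage note is a hypothetical automaton for "infinitely often m".

```python
def rabin_accepts_lasso(delta, q_in, pairs, prefix, loop):
    """Decide whether a deterministic Rabin automaton accepts prefix . loop^omega.

    delta maps (state, letter) to a state; pairs is a list of (E, F) sets."""
    if not loop:
        raise ValueError("the repeated part must be non-empty")
    q = q_in
    for a in prefix:
        q = delta[(q, a)]
    # Read the loop repeatedly until the state at a loop boundary repeats.
    seen, boundary = {}, []
    while q not in seen:
        seen[q] = len(boundary)
        boundary.append(q)
        for a in loop:
            q = delta[(q, a)]
    # Boundary states from index seen[q] onwards recur forever; collect every
    # state visited while reading the loop from those boundary states.
    infinitely_often = set()
    for b in boundary[seen[q]:]:
        p = b
        for a in loop:
            infinitely_often.add(p)
            p = delta[(p, a)]
    return any(not (infinitely_often & E) and bool(infinitely_often & F)
               for E, F in pairs)
```

As a usage example, take states `q0`, `q1` where the automaton moves to `q1` on letter `"m"` and to `q0` on letter `"n"`, with the single pair `(set(), {"q1"})`: the word `n (n m)^omega` is accepted and `m (n)^omega` is rejected.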

As a well known result [2], for every MDP $\mathcal{M}$, valuation $\nu$ and an LTL formula $\varphi$ there
is a Rabin automaton R over the alphabet $2^{AP}$ such that R is constructible in doubly
exponential time and $\omega \models \varphi$ iff R accepts $\nu(\omega)$. We say that R is equivalent to $\varphi$. It is
not clear whether this result and the definition of Rabin automata can be extended to work
with fLTL in a way that would be useful for our goals. The reason for this is, as pointed
out in [21, Section 4, Footnote 4], that the frequencies in fLTL depend on the future of a
run, and so require non-determinism, which is undesirable in stochastic verification.

3 Satisfaction problem for Markov Chains

In this section we show how to solve the satisfaction problem for MCs and fLTL. Let us
fix an MC $\mathcal{M} = (S, \Delta)$, an initial state $s_{in}$ and an fLTL formula $\psi$. We will use the notion of
bottom strongly connected component (bscc) of $\mathcal{M}$, which is a set of states S' such that for
all $s \in S'$ the set of states reachable from s is exactly S'. If s is in a bscc, by bscc(s) we
denote the bscc containing s.

We first describe the algorithm computing the probability of satisfying $\psi$ from $s_{in}$ and
then prove its correctness.
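The bscc notion used throughout this section can be computed directly from its definition, as the following sketch shows. The successor-list encoding and the function names are assumptions for illustration; the quadratic procedure favours closeness to the definition over efficiency, and a production implementation would typically use a linear-time SCC algorithm instead.

```python
def reachable(succ, s):
    """States reachable from s (including s) in the graph succ."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in succ[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return frozenset(seen)

def bsccs(succ):
    """All bottom strongly connected components of the graph succ.

    A state s lies in a bscc iff every state reachable from s has exactly
    the same reachable set as s; that common set is then the bscc."""
    reach = {s: reachable(succ, s) for s in succ}
    return {reach[s] for s in succ
            if all(reach[t] == reach[s] for t in reach[s])}
```

Here `succ[s]` lists the states t with positive transition probability from s, so the probabilities themselves play no role in this step.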

The algorithm The algorithm proceeds in the following steps. First, for each state s
contained in some bscc B, we compute the steady-state frequency $x_s$ of s within B. It is the
number $\mathbb{E}^{s}(\mathrm{freq}(1_{s,0} 1_{s,1} \cdots))$ where $1_{s,i}(\omega)$ equals 1 if the i-th state of $\omega$ is s, and 0
otherwise. Afterwards, we repeat the following steps and keep modifying $\psi$ for as long as it
contains any $G^{p}$ operators:

1. Let $\varphi$ be an LTL formula and p a number such that $\psi$ contains $G^{p}\varphi$.

2. Compute $\mathbb{P}^{s}(\varphi)$ for every state s contained in some bscc.

3. Create a fresh atomic proposition $a_{\varphi,p}$ which is true in a state s iff s is contained in a
bscc and $\sum_{t \in \mathrm{bscc}(s)} x_t \cdot \mathbb{P}^{t}(\varphi) \ge p$.

4. Modify $\psi$ by replacing any occurrence of $G^{p}\varphi$ with $F a_{\varphi,p}$.

Once $\psi$ contains no $G^{p}$ operators, it is an LTL formula and we can use off-the-shelf techniques
to compute $\mathbb{P}^{s_{in}}(\psi)$, which is our desired value.
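The numerical core of the steps above, namely the steady-state frequencies and the threshold test of step 3, might be sketched as follows. Several things here are assumptions for illustration: the bscc is given as a nested dictionary of transition probabilities, the frequencies are obtained by solving the stationary equations exactly with Gauss-Jordan elimination, and the satisfaction probabilities of the LTL formula are taken as given input (`sat_prob`), since computing them is the job of an off-the-shelf LTL model checker.

```python
from fractions import Fraction

def stationary(P, states):
    """Solve x P = x with sum(x) = 1 exactly for an irreducible chain.

    P[s][t] is the probability of moving from s to t inside the bscc."""
    n = len(states)
    # Rows 0..n-2: balance equations; last row: normalisation sum(x) = 1.
    A = [[Fraction(P[states[j]].get(states[i], 0)) - (1 if i == j else 0)
          for j in range(n)] + [Fraction(0)] for i in range(n - 1)]
    A.append([Fraction(1)] * n + [Fraction(1)])
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return {states[i]: A[i][n] for i in range(n)}

def label_holds(x, sat_prob, p):
    """Step 3: is sum_t x_t * P^t(phi) >= p on this bscc?"""
    return sum(x[t] * sat_prob[t] for t in x) >= p
```

For a bscc with states c and d where c always moves to d and d moves to c or stays with probability 1/2 each, `stationary` yields 1/3 and 2/3; with hypothetical satisfaction probabilities 1 for c and 1/2 for d the weighted sum is 2/3, so the fresh proposition would hold for p = 2/3 but not for p = 3/4.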

Correctness The correctness of the algorithm relies on the fact that $a_{\varphi,p}$ labels states in
a bscc B if almost every run reaching B satisfies the corresponding frequency constraint:

► Proposition 2. For every LTL formula $\varphi$, every number p, every bscc B and almost every
run $\omega$ that enters B we have $\omega \models G^{p}\varphi$ if and only if $\sum_{t \in B} x_t \cdot \mathbb{P}^{t}(\varphi) \ge p$.

The proposition might seem “obviously true”, but the proof is not trivial. The main
obstacle is that satisfactions of $\varphi$ on $\omega(i)$ and $\omega(j)$ are not independent events in general: for
example if $\varphi = F a$ and i < j, then $\omega(j) \models \varphi$ implies $\omega(i) \models \varphi$. Hence we cannot apply the
Strong law of large numbers (SLLN) for independent random variables or the Ergodic theorem
for Markov chains [?, Theorems 1.10.1-2], which would otherwise be obvious candidates.
Nevertheless, we can use the following variant of SLLN for correlated events.

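► Lemma 3. Let $Y_0, Y_1, \ldots$ be a sequence of random variables which only take values 0 or
1 and have expectation $\mu$. Assume there are 0 < r, c < 1 such that for all $i, j \in \mathbb{N}$ we have
$\mathbb{E}((Y_i - \mu)(Y_j - \mu)) \le r^{\lfloor c\,|i-j| \rfloor}$. Then $\lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} Y_i = \mu$ almost surely.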

Using the above lemma, we now prove Proposition 2 for fixed $\varphi$, B, p. Let R denote the
Rabin automaton equivalent to $\varphi$ and $\mathcal{M} \times R$ be the Markov chain product of $\mathcal{M}$ and R.

First, we say that a finite path $s_0 \ldots s_k$ of $\mathcal{M}$ is determined if the state $q_k$ reached by
R after reading $\nu(s_0 \ldots s_{k-1})$ satisfies that $(s_k, q_k)$ is in a bscc of $\mathcal{M} \times R$. We point out
that for a determined path $s_0 \ldots s_k$, either almost every run of $\mathrm{Cyl}(s_0 \ldots s_k)$ satisfies $\varphi$, or
almost no run of $\mathrm{Cyl}(s_0 \ldots s_k)$ satisfies $\varphi$. Also, the probability of runs determined within k
steps is at least $1 - (1 - r^{M})^{\lfloor k/M \rfloor}$ where M is the number of states
of $\mathcal{M} \times R$ and r is the minimum probability that occurs in $\mathcal{M} \times R$.

Now fix a state $s \in B$. For all $t \in B$ and $i > 0$ we define random variables $X^{t}_{i}$ over runs
initiated in s. We let $X^{t}_{i}(\omega)$ take value 1 if t is visited at least i times in $\omega$ and the suffix of
$\omega$ starting from the i-th visit to t satisfies $\varphi$. Otherwise, we let $X^{t}_{i}(\omega) = 0$. Note that all $X^{t}_{i}$
have a common expected value $\mu_t = \mathbb{P}^{t}(\varphi)$.

Next, let i and j be two numbers with i < j. We denote by $\Omega$ the set of all runs and
by D the set of runs $\omega$ for which the suffixes starting from the i-th visit to t are determined
before the j-th visit to t (note that D can possibly be $\emptyset$). Because on these determined runs
$\mathbb{E}^{s}(X^{t}_{j} - \mu_t \mid D) = 0$, we get

$$\mathbb{E}^{s}\bigl((X^{t}_{i} - \mu_t)(X^{t}_{j} - \mu_t)\bigr) \;\le\; 1 - \mathbb{P}^{s}(D) \;\le\; (1 - r^{M})^{\lfloor (j-i)/M \rfloor}$$

as shown in Appendix A.1. Thus, Lemma 3 applies to the random variables $X^{t}_{i}$ for a
fixed t. Considering all $t \in B$ together, we show in Appendix A.2 that
$\mathrm{freq}(1_{\varphi,0} 1_{\varphi,1} \cdots) = \sum_{t \in \mathrm{bscc}(s)} x_t \cdot \mathbb{P}^{t}(\varphi)$
for almost all runs initiated in the state s we fixed above. Because almost
all runs that enter B eventually visit s, and because satisfaction of $G^{p}\varphi$ is independent of
any prefix, the proof of Proposition 2 is finished, and we can establish the following.


► Theorem 4. The satisfaction problem for Markov chains and fLTL is solvable in time 
polynomial in the size of the model, and doubly exponential in the size of the formula. 


4 Controller synthesis for MDPs

We now proceed with the controller synthesis problem for MDPs and 1-fLTL. The problem
for this restricted fragment of 1-fLTL is still highly non-trivial. In particular, it is not
equivalent to synthesis for the LTL formula where every $G^{1}$ is replaced with G. Indeed,
for satisfying any LTL formula, finite memory is sufficient, while for 1-fLTL, the following
theorem shows that infinite memory may be necessary.

► Theorem 5. There is a 1-fLTL formula $\psi$ and a Markov decision process $\mathcal{M}$ with valuation
$\nu$ such that the answer to the controller synthesis problem is “yes”, but there is no
finite-memory strategy witnessing this.

Proof idea. Consider the MDP from Figure 1 together with the formula
$\psi = G F m \wedge G^{1}(q \to X r)$. Independent of the strategy being used, no run initiated in $s_4$ satisfies the
subformula $q \to X r$, while every run initiated in any other state satisfies this subformula.
This means that we need the frequency of visiting $s_4$ to be 0. The only finite-memory
strategies achieving this are those that from some history on never choose to go right in the
controllable state. However, under such strategies the formula $G F m$ is not almost surely
satisfied. On the other hand, the infinite-memory strategy that on the i-th visit to $s_0$ picks m
if and only if i is of the form $2^{j}$ for some j satisfies $\psi$.

Note that although the above formula requires infinite memory due to “conflicting”
conjuncts, infinite memory is needed already for simpler formulae of the form $G^{1}(a\,U\,b)$. ◄
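The counting argument behind the infinite-memory strategy can be illustrated numerically. The sketch below only counts, among the first n visits to the controllable state, how many trigger maintenance; the function names are hypothetical, and the actual frequency of the maintenance state along a run also depends on the transition probabilities of the MDP, which are not modelled here.

```python
def picks_maintenance(i):
    """True iff the visit number i is of the form 2**j for some j >= 0."""
    return i >= 1 and (i & (i - 1)) == 0

def maintenance_ratio(n):
    """Fraction of the first n visits on which action m is chosen."""
    return sum(picks_maintenance(i) for i in range(1, n + 1)) / n
```

Maintenance is chosen infinitely often, yet only about $\log_2 n$ of the first n visits trigger it, so the ratio tends to 0. A finite-memory strategy cannot produce such ever-growing gaps.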

The above result suggests that it is not possible to easily re-use verification algorithms 
for ordinary LTL. Nevertheless, our results allow us to establish the following theorem. 

► Theorem 6. The controller-synthesis problem for MDPs and 1-fLTL is solvable in time 
polynomial in the size of the model and doubly exponential in the size of the formula. 

For the rest of this section, in which we prove Theorem 6, we fix an MDP $\mathcal{M}$, a valuation
$\nu$, an initial state $s_{in}$, and a 1-fLTL formula $\psi$. The proof is given in two parts. In the first
part, in Section 4.1, we show that the controller-synthesis problem is equivalent to problems
of reaching a certain set T and then “almost surely winning” from this set. To prove this,
the “almost surely winning” property will further be reduced to finding certain sets of states
and actions on a product MDP (Lemma [?]). In the second part of the proof, given in
Section 4.2, we will show that all the above sets can be computed.

4.1 Properties of satisfying strategies 

Without loss of generality suppose that the formula $\psi$ does not contain $G^{1}$ as the outermost
operator, and that it contains n subformulas of the form $G^{1}\varphi$. Denote these subformulas
$G^{1}\varphi_1, \ldots, G^{1}\varphi_n$. For example, $\psi = i \to (G(q \to a) \wedge G^{1}(p_1\,U\,r \vee G^{1}a))$ contains
$\varphi_1 = p_1\,U\,r \vee G^{1}a$ and $\varphi_2 = a$.

The first step of our construction is to convert these formulae $\varphi_1, \ldots, \varphi_n$ to equivalent
Rabin automata. However, as the formulae contain $G^{1}$ operators, they cannot be directly
expressed using Rabin automata (and as pointed out by [21], there is a fundamental obstacle
preventing us from extending Rabin automata to capture $G^{1}$).

To overcome this, we replace all occurrences of $G^{1}\varphi_i$ in such formulae by either true
or false, to capture that the frequency constraint is or is not satisfied on a run. Such a
replacement can be fixed only after a point in the execution is reached where it becomes
clear which frequency constraints in $\psi$ can be satisfied. For a formula $\xi \in \{\psi, \varphi_1, \ldots, \varphi_n\}$,
any subset $I \subseteq \{1, \ldots, n\}$ of satisfied constraints defines an LTL formula $\xi_I$ obtained from $\xi$
by replacing all subformulas $G^{1}\varphi_i$ (not contained in any other $G^{1}$) with true if $i \in I$ and
with false if $i \notin I$. The Rabin automaton for $\xi_I$ is then denoted by $R_{\xi,I}$. For the formula $\psi$
above, we have, e.g., $\psi_{\{1\}} = \psi_{\{1,2\}} = i \to (G(q \to a) \wedge \text{true})$, and $\varphi_{1,\{1\}} = p_1\,U\,r \vee \text{false}$.
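The replacement of frequency subformulas by constants can be sketched on a small syntax tree. The tuple encoding of formulas, the operator tags (`"G1"`, `"imp"`, `"ap"`, and so on) and the `index` dictionary that maps each operand of a $G^{1}$ to its number i are all assumptions made for this illustration.

```python
def substitute(xi, index, I):
    """Replace every outermost G^1 subformula of xi by true or false.

    Formulas are nested tuples such as ("or", f, g) or ("G1", f); index maps
    the operand of each G^1 to its number i; I is the set of constraints
    assumed satisfied. Nested G^1 operators are not visited because the
    recursion stops at the outermost one."""
    op = xi[0]
    if op == "G1":
        return ("true",) if index[xi[1]] in I else ("false",)
    if op in ("ap", "true", "false"):
        return xi
    return (op,) + tuple(substitute(arg, index, I) for arg in xi[1:])

# The running example: psi = i -> (G(q -> a) and G^1(p1 U r or G^1 a)).
phi2 = ("ap", "a")
phi1 = ("or", ("U", ("ap", "p1"), ("ap", "r")), ("G1", phi2))
psi = ("imp", ("ap", "i"),
       ("and", ("G", ("imp", ("ap", "q"), ("ap", "a"))), ("G1", phi1)))
index = {phi1: 1, phi2: 2}
```

With this encoding, `substitute(psi, index, {1})` and `substitute(psi, index, {1, 2})` coincide, because the inner $G^{1}a$ is hidden inside the outer $G^{1}$, while `substitute(phi1, index, {1})` turns the disjunct $G^{1}a$ into false.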

We use Q for a disjoint union of the state spaces of these distinct Rabin automata, and
$Q_{\psi}$ for a disjoint union of the state spaces of the automata $R_{\psi,I}$, called main automata, for
all I. Finally, for $q \in Q$ belonging to a Rabin automaton R we denote by $R^{q}$ the automaton
obtained from R by changing its initial state to q.

Let us fix a state s of $\mathcal{M}$ and a state q of $R_{\psi,I}$ for some $I \subseteq \{1, \ldots, n\}$. We say that a
run $s_0 a_0 s_1 a_1 \cdots$ reaches (s, q) if for some k we have $s = s_k$ and q is the state reached by the
main automaton $R_{\psi,I}$ after reading $\nu(s_0 a_0 s_1 \ldots s_{k-1})$. Once (s, q) is reached, we say that a
strategy $\sigma'$ is almost-surely winning from (s, q) if $\mathbb{P}^{s}_{\sigma'}$ assigns probability 1 to the set of runs
$\omega$ such that $\nu(\omega)$ is accepted by $R^{q}_{\psi,I}$, and $\omega \models G^{1}\varphi_i$ whenever¹ we have $i \in I$.

► Proposition 7. There is a strategy $\sigma$ such that $\mathbb{P}_{\sigma}(\psi) = x$ if and only if there is a set
$T \subseteq S \times Q_{\psi}$ for which the following two conditions are satisfied:

1. There is a strategy $\sigma'$ such that $\mathbb{P}_{\sigma'}(\{\omega \mid \omega \text{ reaches a pair from } T\}) = x$.

2. For any $(s, q) \in T$ there is a strategy $\sigma_{s,q}$ almost-surely winning from (s, q).

Intuitively, the proposition relies on the fact that if $G^{1}\varphi_i$ holds on a run, then it holds on
all its suffixes, and says that any strategy $\sigma$ can be altered so that almost surely there will
be a prefix after which we know which of the $G^{1}\varphi_i$ will be satisfied.

► Example 8. Let us first illustrate the set T on the formula $X q \wedge G F m \wedge G^{1}(q \to X r)$
that can be satisfied on the MDP from Figure 1 with probability 0.5. Figure 2 shows Rabin
automata for the formulae $\psi_{\{1\}} = X q \wedge G F m \wedge \text{true}$ (left) and $\varphi_{1,\{1\}} = q \to X r$. In
this simple example, the “decision” whether the formula will be satisfied (and which $G^{1}$
subformulas will be satisfied) comes after the first step. Thus, we can set $T = \{(s_1, q_1)\}$.

Figure 2 Example Rabin aut.

We now prove Proposition 7. The direction $\Leftarrow$ is straightforward. It suffices to define $\sigma$ so
that it behaves as $\sigma'$ initially until it reaches some $(s, q) \in T$ for the first time; then it
behaves as $\sigma_{s,q}$.

Significantly more difficult is the direction $\Rightarrow$ of Proposition 7 that we address now. We fix
a strategy $\sigma$ with $\mathbb{P}_{\sigma}(\psi) = x$. The proof is split into three steps. We first show how to
identify the set T, and then we show that items 1 and 2 of Proposition 7 are satisfied. The last
part of the proof requires most of the effort. In the proof, we
will need to eliminate some unlikely events, and for this we will require that their probability
is small to start with. For this purpose, we fix a very small positive number $\lambda$; to avoid
cluttering of notation, we do not give a precise value of $\lambda$, but instead point out that it needs
to be chosen such that any numbers that depend on it in the following text have the required
properties (i.e. are sufficiently small or big; note that such a choice is indeed possible). We
should stress that $\lambda$ influences neither the size of the representation of our strategy nor the
complexity of our algorithm.

¹ Note that the product construction that we later introduce does not give us “iff” here. This is also why
we require the negations to only occur in front of atomic propositions.

Identifying the set T 

In the first step, we identify an appropriate set T. Intuitively, we put into T positions
of the runs satisfying $\psi$ where the way $\psi$ is satisfied is (nearly) decided, i.e. where it is
(nearly) clear which frequency constraints will be satisfied by $\sigma$ in the future. To this
end, we mark every run $\omega$ satisfying $\psi$ with a set $I_{\omega} \subseteq \{1, \ldots, n\}$ such that $i \in I_{\omega}$ iff
the formula $G^{1}\varphi_i$ holds on the run. We then define a set of finite paths F to contain all
paths h for which there is $I_h \subseteq \{1, \ldots, n\}$ such that exactly the frequency constraints from
$I_h$ as well as $\psi$ are satisfied on (nearly) all runs starting with h. Precisely, such that
$\mathbb{P}_{\sigma}(\{\omega' \mid \omega' \models \psi \wedge I_{\omega'} = I_h\} \mid h) > 1 - \lambda$. Finally, for every $h \in F$ we add to T the pair
(s, q) where h = h's and q is the state in which $R_{\psi,I_h}$ ends after reading $\nu(h')$.

Reaching T 

It suffices to show that the strategy $\sigma$ itself satisfies $\mathbb{P}_{\sigma}(F) = x$. We will use the following
variant of Lévy’s Zero-One Law, a surprisingly powerful formalization of the intuitive fact
that “things need to get (nearly) decided, eventually”.

► Lemma 9 (Lévy’s Zero-One Law [12]). Let $\sigma$ be a strategy and X a measurable set of runs.
Then for almost every run $\omega$ we have $\lim_{n\to\infty} \mathbb{P}_{\sigma}(X \mid h_n) = 1_X(\omega)$ where each $h_n$ denotes
the prefix of $\omega$ with n states and the function $1_X$ assigns 1 to $\omega \in X$ and 0 to $\omega \notin X$.

For every $I \subseteq \{1, \ldots, n\}$ we define $X_I = \{\omega' \mid \omega' \models \psi \wedge I_{\omega'} = I\}$ to be the set of runs that
are marked by I and satisfy the formula $\psi$. Then by Lemma 9, for almost every run $\omega$ that
satisfies $\psi$ and has $I_{\omega} = I$, there must be a prefix h of the run for which $\mathbb{P}_{\sigma}(X_I \mid h) > 1 - \lambda$
because $\omega \in X_I$. Any such prefix was added to F, with $I_h = I$.

Almost-surely winning from T 

For the third step of the proof of direction $\Rightarrow$ of Proposition 7, we fix $(s^*, q^*) \in T$ and
we construct a strategy $\sigma_{s^*,q^*}$ that is almost-surely winning from $(s^*, q^*)$. Furthermore,
let $I^* \subseteq \{1, \ldots, n\}$ denote the set such that $q^*$ is a state from $R_{\psi,I^*}$. As we have shown
in Theorem 5, strategies might require infinite memory, and this needs to be taken into
consideration when constructing $\sigma_{s^*,q^*}$. The strategy cycles through two “phases”, called
accumulating and reaching, that we illustrate on our example.

► Example 10. Returning to Example]^ we fix (s*,( 7 *) = (si,( 7 i) and I* = {!}, with the 
corresponding history from F being sqWSi. The strategy <Jsi,qi we would like to obtain 

n first “accumulates” arbitrarily many steps from which all (p\^^ can be almost surely 
satisfied. I.e., it accumulates arbitrarily many newly started instances of the Rabin 
automaton (all being in state q^) by repeating action w in sq. 

H Then it “reaches” with all the Rabin automata and Rip^^{i} accumulated in the 

previous phase their accepting states q^ and qr respectively. For this happens 
without any intervention of the strategy, but for the strategy needs to take the

action m. Then after returning to sq it comes back to a state where the next accumulating 
phase starts. Thus, we need to make sure we make the accumulating phases progressively 
longer so that in the long run they take place with frequency 1. 

The proof that such a simple behaviour suffices is highly non-trivial. To illustrate this, let us 
extend the MDP from Figure [I| with an action decline with A(si,decline) = sq. The strategy 
a from the proof of Theorem ^satisfies Pcr('0) = lfor^ = GFm A G^(q—)-Xr). However, 
we can modify a and obtain a “weird” strategy a' that takes the action decline in the i-th 
visit to Si with probability 1/2®. Such a strategy (a) still satisfies Va'{fp) = 1/2 but (b) it 
does not guarantee almost sure satisfaction of in si. Thus, it does not accumulate in 
the sense explained above. We will show that any such weird strategy can be slightly altered 
to fit into our scheme. ◄ 


To show that alternation between such accumulating and reaching suffices (and to make
a step towards the algorithm to construct such $\sigma_{s^*,q^*}$), we introduce a tailor-made product
construction $\mathcal{M}_\otimes$. The product keeps track of a collection of arbitrarily many Rabin
automata accumulated up to now. We need to make sure that almost all runs of all automata in
the collection are accepting, and we will do this by ensuring that: (i) almost every
computation of all Rabin automata eventually commits to an accepting condition $(E, F)$, (ii)
from the point the automaton “commits” to the accepting condition, no more elements of
$E$ are visited, and (iii) some element of $F$ is visited infinitely often. To ensure this, we store
additional information along any state $q \in Q$ of each automaton:

- $(q, ☆)$ is a new instance that has to commit to an accepting condition soon;

- $(q, (E, F)_\circ)$ is an instance that has to visit a state of $F$ soon;

- $(q, (E, F)_\bullet)$ is an instance that recently fulfilled the accepting condition by visiting $F$;

- $(q, \bot)$ is an instance that violated the accepting condition it had committed to.


Let $\mathcal{C}$ denote the set of these pairs for all $q \in Q$ and all accepting conditions $(E, F)$ of the
Rabin automaton where the state $q$ belongs. Note that $\mathcal{C}$ is finite; because we need to encode
an unbounded number of instances of Rabin automata along the run, each element of a collection
$C \subseteq \mathcal{C}$ might stand for multiple instances that are in exactly the same configuration. We
say that $C \subseteq \mathcal{C}$ is fulfilled if it contains only elements of the form $(q, (E, F)_\bullet)$. The aim is
to fulfil the collection infinitely often; the precise meaning of “recently” and “soon” above is
then “since the last fulfilled state” and “before the next fulfilled state”.

Using the product $\mathcal{M}_\otimes$ we show that if there is a satisfying strategy in $\mathcal{M}$, there is
a strategy in $\mathcal{M}_\otimes$ with a simple structure that visits a fulfilled state infinitely often (in
Lemma 13). Due to the simple structure, such a strategy can be found algorithmically.
Finally, we show that such a strategy in the product induces a satisfying strategy in $\mathcal{M}$ (in
Lemma 12), yielding correctness of the algorithm.


The product. Let $\mathcal{M}_\otimes$ be an MDP with states $S_\otimes = S \times 2^{\mathcal{C}}$, actions $A_\otimes = A \times 2^{\mathcal{C}}$, and
transition function $\Delta_\otimes$ defined as follows. We first define possible choices of a strategy in
$\mathcal{M}_\otimes$. Given a state $(s, C_s)$, we say that an action $(a, C_a)$ is legal in $(s, C_s)$ if $a$ is a valid
choice in $s$ in the original MDP, i.e. $\Delta(s, a)$ is defined; and $C_a$ satisfies the following:

- for all tuples $(q, ☆) \in C_s$ we have $(q, ☆) \in C_a$ or $(q, (E, F)_\circ) \in C_a$ for some accepting
  condition $(E, F)$ ($q$ can “commit” to $(E, F)$, or keep waiting);

- for all $(q, x) \in C_s$ with $x \neq ☆$ we have $(q, x) \in C_a$ (all $q$ are kept along with the commitments);

- all $(q, x) \in C_a$, not added by one of the two above items, are of the form $(q_{in}, ☆)$ where
  $q_{in}$ is the initial state of a Rabin automaton $R_{\varphi_i,I^*}$ for $i \in I^*$ (initial states can be added).
single action is available.


The randomness in $\mathcal{M}_\otimes$ comes only from $\mathcal{M}$. We set $\Delta_\otimes((s, C_s), (a, C_a))(t, C_t) = \Delta(s, a)(t)$
for any state $(s, C_s)$, any action $(a, C_a)$ legal in $(s, C_s)$, and any state $(t, C_t)$ such that $C_a$
“deterministically evolves” by reading $s$ into $C_t$. Precisely, we require that $C_t$ is the minimal
set such that for any $(q, x) \in C_a$ there is $(q', x') \in C_t$ with $q \xrightarrow{s} q'$ and $x \leadsto x'$ where the
latter relation is defined by $☆ \leadsto ☆$ and $\bot \leadsto \bot$ and for any $\ast \in \{\bullet, \circ\}$ by

- $(E, F)_\ast \leadsto (E, F)_\ast$ if $q' \notin E \cup F$ and $C$ is not fulfilled (no special state visited);

- $(E, F)_\ast \leadsto (E, F)_\circ$ if $q' \notin E \cup F$ and $C$ is fulfilled (resetting back to $\circ$);

- $(E, F)_\ast \leadsto (E, F)_\bullet$ if $q' \in F$ (the accepting condition becomes fulfilled);

- $(E, F)_\ast \leadsto \bot$ if $q' \in E$ (the accepting condition is violated).

Finally, a state is called fulfilled if its second component is fulfilled.
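The “deterministic evolving” step can be sketched in code. The following is only an illustration under assumptions that are not in the paper: marks are encoded as the strings `"star"` and `"bot"` and the tuples `("open", E, F)` and `("done", E, F)`; the automaton step is passed in as a hypothetical function `step(q, s)`; the flag `reset` says whether the collection referred to as fulfilled in the rules above is fulfilled; and a state lying in both $E$ and $F$ is treated as a violation.

```python
STAR, BOT = "star", "bot"  # stand-ins for the marks written as a star and as bottom

def is_fulfilled(collection):
    """True if every element carries a mark of the form ("done", E, F).

    Note that the empty collection counts as fulfilled under this encoding.
    """
    return all(isinstance(x, tuple) and x[0] == "done" for _, x in collection)

def evolve_mark(x, q_next, reset):
    """Update one mark after its automaton instance has moved to q_next."""
    if x in (STAR, BOT):
        return x                      # uncommitted and violated marks are kept
    _, E, F = x
    if q_next in E:
        return BOT                    # the committed condition is violated
    if q_next in F:
        return ("done", E, F)         # the condition becomes fulfilled
    return ("open", E, F) if reset else x

def evolve(chosen, s, step, reset):
    """Evolve the chosen collection by reading the MDP state s."""
    return frozenset(
        (step(q, s), evolve_mark(x, step(q, s), reset)) for q, x in chosen
    )

# Toy automaton: reading a letter moves to the state with the same name.
E, F = frozenset({"bad"}), frozenset({"good"})
toy_step = lambda q, s: s
toy_chosen = frozenset({("q0", ("open", E, F)), ("q0", STAR)})
after_good = evolve(toy_chosen, "good", toy_step, reset=False)
```

Because collections are sets, two instances that reach the same state with the same mark are merged, which is what keeps the product finite.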

► Example 11. Figure shows one path in the product Al® for the MDP and the Rabin 
automata from Example The path shown illustrates how the initial states can be added 
non-deterministically (in the first three steps), and then reaches a fulfilled state. 

A very useful property of the product is that any strategy that ensures visiting fulfilled 
states infinitely often yields a strategy in the original MDP such that the automata the 
strategy added almost surely accept. This is formalised in the following lemma. 

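► Lemma 12. For a deterministic strategy $\pi$ in $\mathcal{M}_\otimes$ there is a strategy $\pi'$ in $\mathcal{M}$ that for any
$h = (s_0, C_0) \cdots (a_n, D_n)(s_{n+1}, C_{n+1})$ with $\mathbb{P}^{h}_{\pi}(\{\text{fulfilled state visited infinitely often}\}) = 1$:

- $\mathbb{P}_{\pi'}(s_0 \ldots a_n s_{n+1}) = \mathbb{P}_{\pi}(h)$, and

- for any $(q, ☆) \in D_n$ where $R$ is the automaton of $q$, $\mathbb{P}^{s_0 \ldots a_n s_{n+1}}_{\pi'}(\{\omega \mid R^{q} \text{ accepts } \omega\}) = 1$.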

To be able to use above lemma, we need to establish that it is sufficient to look for 
a strategy that visits fulfilled states infinitely often. In other words that existence of the 
satisfying strategy a implies existence of a strategy that visits fulfilled states infinitely often. 
Here we use the following lemma saying that a and (s*,g*) give rise to two strategies in 
the product Al® that can be used to add initial states into the collections, and to reach 
fulfilled states. We will show below how these strategies can be used to finish the proof of 
Proposition 

► Lemma 13. Assume $s^*, q^*, I^*$ are chosen as described above. Then there are sets
$M \subseteq S_\otimes$, $N \subseteq A_\otimes$ where $N$ contains only “accumulating” actions, i.e. actions $(a, C)$ with
$\{(q_{in}, ☆) \mid q_{in} \text{ is the initial state of } R_{\varphi_i,I^*} \text{ for } i \in I^*\} \subseteq C$; and there are finite-memory
deterministic strategies $\pi$ and $\zeta$ such that:

1. When starting in $(s, C) \in M$, $\pi$ only uses actions from $N$ and never leaves $M$.

2. When starting in $(s, C) \in M \cup \{(s^*, \{(q^*, ☆)\})\}$, $\zeta$ almost surely reaches a fulfilled state
(possibly leaving $M$) and then reaches $M$.
Proof idea. The proof is involved and gives a crucial insight into the main obstacles of the
proof of Theorem]^ Due to the space constraints we only sketch it here. 

We first prove that for any fixed $\ell$, almost every $\omega$ that satisfies all $\mathbf{G}^{1}\varphi_i$ has infinitely
many good prefixes. Intuitively, a finite path $h$ is good if, when starting from $h$, all the
automata $R_{\varphi_i,I^*}$ for $i \in I^*$ started within the first $\ell$ steps accept with probability at least $1 - \lambda$.

In the second step, we show how to avoid actions that cause some $R_{\varphi_i,I^*}$ not to
accept. To do so, we inductively start labelling the prefixes of runs of the MDP with elements
of $\mathcal{C}$. Having fixed a label for a prefix, the label for its extension is obtained by “deterministic
evolving” as in the definition of the product MDP, and by (non-deterministically) adding
initial states $(q_{in}, ☆)$. The latter part is performed by switching between a “pseudo-accumulating” and
“pseudo-reaching” phase. Initially, we start in a pseudo-reaching phase, only with singletons
corresponding to the current state of $R_{\psi,I^*}$ and do not add any $(q_{in}, ☆)$. When a good prefix
is reached (which happens almost surely), we switch to a pseudo-accumulating phase for the
next $\ell$ steps and we keep adding “initial states” of $R_{\varphi_i,I^*}$ for each $i \in I^*$. After $\ell$
steps, we switch back to a pseudo-reaching phase and do not add any new elements to the
label until we pass through a state whose label is fulfilled and get to a good prefix again, at
which point another pseudo-accumulating phase starts.

Along the way, we might obtain tuples of the form $(q, \bot)$ in the label, or we might not
ever visit a fulfilled state. Indeed, if we repeated our steps to infinity, such an “error” might
take place almost surely. However, before an error happens with too high probability, the
labels start repeating because $\mathcal{C}$ is finite. We show that supposing $\ell$ was large enough and
our tolerance $\lambda$ was small enough, there must be a strategy that almost-surely traverses
such a cycle without any error. We can extract from the pseudo-accumulating and
pseudo-reaching phases of such a strategy the sets $M$ (and $N$), given by the tuples of the MDP
states (actions) and their labels. ◄

We are now ready to finish the proof of Proposition. We show that Lemma 13 allows us
to construct a strategy $\sigma_\otimes$ for $\mathcal{M}_\otimes$ that almost surely (i) visits fulfilled states, and (ii) with
frequency 1 it takes actions from $N$. By Lemma 12 this strategy yields an almost-surely
winning strategy $\sigma_{s^*,q^*}$ in $\mathcal{M}$.

The strategy $\sigma_\otimes$ is constructed as follows. Inductively, for a path $h$ in $\mathcal{M}_\otimes$, we say that
its first accumulating phase starts in the first step, the $i$-th accumulating phase takes $i$ steps,
and the $(i+1)$-th accumulating phase starts when the set $M$ is reached through a fulfilled
state after the $i$-th accumulating phase ended. Within every accumulating phase started in a
history $h$, $\sigma_\otimes$ is defined to play as $\pi$ initiated after $h$. Similarly, outside every accumulating
phase ended in a history $h$, $\sigma_\otimes$ is defined to play as $\zeta$.
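The reason why growing accumulating phases give frequency 1 can be checked with simple arithmetic. The sketch below makes a simplifying assumption that is not in the paper: every reaching phase takes the same fixed number of steps, whereas in the construction its length is random. The function name and parameters are illustrative.

```python
def accumulating_fraction(num_rounds, reaching_length):
    """Fraction of steps spent accumulating after `num_rounds` rounds.

    Round i consists of an accumulating phase of i steps followed by a
    reaching phase of `reaching_length` steps (assumed constant here).
    """
    accumulating = sum(range(1, num_rounds + 1))
    reaching = num_rounds * reaching_length
    return accumulating / (accumulating + reaching)

short_run = accumulating_fraction(10, 5)
long_run = accumulating_fraction(1000, 5)
```

Under this assumption the fraction equals $(k+1)/(k+1+2r)$ after $k$ rounds with reaching length $r$, so it tends to 1 as $k$ grows, while any fixed bound on the accumulating phases would cap it strictly below 1.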

4.2 The algorithm 

To conclude the proof of Theorem we need to give a procedure for computing the optimal
probability of satisfying $\psi$. It works in the following steps (for details, see Appendix B):

1. Initialize $T := \emptyset$, and construct $R_{\xi,I}$ for all $\xi \in \{\psi, \varphi_1, \ldots, \varphi_n\}$ and $I \subseteq \{1, \ldots, n\}$.

2. For every $I$ find the largest sets $(M_I, N_I)$ satisfying the conditions 1–2 of Lemma 13 and
add to $T$ all pairs $(s, q)$ such that $M_I$ can be almost-surely reached from $(s, \{(q, ☆)\})$.

3. Compute an optimal strategy $\sigma'$ for “reaching” $T$ and return the probability. Intuitively,

   - we build the “naive” product of $\mathcal{M}$ with all the main automata $R_{\psi,I}$ for $I \subseteq \{1, \ldots, n\}$;

   - reaching $T$ is reduced to ordinary reachability of all states of the form $(s, q_1, \ldots, q_m)$
     such that $(s, q_i) \in T$ for some $i$;

   - by standard algorithms for reachability in MDP, we find an optimal strategy $\sigma''$ in
     the naive product that easily induces the strategy $\sigma'$ in $\mathcal{M}$.
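Step 3 boils down to maximal reachability in a finite MDP. The paper only refers to “standard algorithms”; as one possible instance, the sketch below uses value iteration, which yields approximations from below rather than exact values (exact values are usually obtained by linear programming or policy iteration). The dictionary encoding and the toy MDP are assumptions made for illustration.

```python
def max_reach_probability(transitions, targets, iterations=1000):
    """Approximate the maximal probability of reaching `targets`.

    `transitions[state][action]` maps successor states to probabilities;
    every state must appear as a key, with an empty dict if it has no action.
    """
    value = {s: (1.0 if s in targets else 0.0) for s in transitions}
    for _ in range(iterations):
        new_value = {}
        for s, actions in transitions.items():
            if s in targets:
                new_value[s] = 1.0
            elif not actions:
                new_value[s] = 0.0    # dead end outside the target set
            else:
                # Best expected value over the available actions.
                new_value[s] = max(
                    sum(p * value[t] for t, p in succ.items())
                    for succ in actions.values()
                )
        value = new_value
    return value

toy_mdp = {
    "s0": {"risky": {"goal": 0.5, "sink": 0.5}, "safe": {"s1": 1.0}},
    "s1": {"try": {"goal": 0.9, "s0": 0.1}},
    "goal": {},
    "sink": {},
}
values = max_reach_probability(toy_mdp, {"goal"})
```

In the toy MDP the action `safe` is optimal in `s0`: repeating `safe` and `try` reaches `goal` almost surely, whereas `risky` succeeds only with probability 0.5.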


By connecting Proposition, Lemmas 12 and 13 and the construction of $\sigma_\otimes$ above, there
is a strategy $\sigma$ in $\mathcal{M}$ yielding probability $> p$ iff the set $T$ computed by the algorithm can
also be reached with probability $> p$.

We briefly discuss the complexity of the algorithm. Each of the Rabin automata in step]^ 
above can be computed in time 1?'' ^ and since there is exponentially many such automata 

(in IpI), step 1. takes time Stepj^can be performed in time poly(S') • 22’’°'''''''^''. In 

step we are computing reachability probability in the naive product MDP which is of size 
poly(S') • 22’’°'’'^''^''^ and so also this step can be done in time poly(S') • 22 ”°''''''^''. 


5 Conclusions

We have given algorithms for controller synthesis of the logic LTL extended with an operator 
expressing that frequencies of some events exceed a given bound. For Markov chains we gave 
an algorithm working with the complete logic, and for MDPs we require the formula to be 
from a certain fragment. The obvious next step is extending the MDP results to the whole 
fLTL. This will require new insights. Our product construction relies on the (non-trivial)
observation that given $\mathbf{G}^{1}\varphi$, the formula $\varphi$ is almost surely satisfied from any history of
an accumulating phase. This is no longer true when the frequency bound is lower than 1.
In such cases different histories may require different probability of satisfying $\varphi$. However,
both authors strongly believe that even for these cases the problem is decidable. Another 
promising direction for future work is implementing the algorithms into a probabilistic model 
checker and evaluating their time requirements experimentally. 
A Details for proof for Markov chains

Lemma Let Yi, 12 ■ • ■ be a sequence of random variables which only take values 0 or 1 
and have expectation /x. Assume there are 0 < r, c < 1 such that for all z, j S N we have 
E((Yi — fi){Yj — /x)) < , Then lim„_>oo J27=o M almost surely. 

Proof. We can use m Corollary 4] applied to random variables Zi = Yi — (we can¬ 
not use the result directly for Yi since m requires the random variables to have expect¬ 
ation value equal to 0). Clearly if lim„_>.oo then lim„_>.oo “ 

lim„_j.oo In = Finally, the corollary of [Uj indeed applies since J2T=o ^~k — — 

Er=o =r-^- < 1 A(1 - O < oo ◄ 

A.1 Properties of random variables

The following is a more detailed computation for properties of the random variables Xj. 
First, we need to extend the definition of a path being determined. We say that a path 
h is positively determined if almost every run of Cyl{so ... Sk) satisfies tp, and negatively 
determined if almost no run of Cyl{so ■ ■ ■ Sfc) satisfies tp. Now splitting runs of D to and 
D~ depending on whether the associated path is positively or negatively determined, we 
have: 

W{{xl-Pi){x]-Pi)) 

= ^^{D+)-W{{l-pi){X]-pt)\D+) 

+ V^{D-)-W{{-pi){X]-pt)\D-) 

+ \ {D+ U D-)) ■ E^((A* - /xt)(Aj -pt)\n\ {D+ U Z?")) 

= V%D+) ■ a - Pt) -WiX^^ - Pi \ D+) 

+ F%D-) ■ i-pt) -E^iX* - pt\ D-) 

+ \ {D+ U D-)) ■ E^((A* - pt){Xj -Pt)\n\ {D+ U D-)) 

and because E®(Aj — pt \ D+) = E^(Aj — pt) \ D~) = 0 (as shown later), we have 

= P^(fZ \ {D+ U D-)) ■ E^((X‘ - pi){X] -pt)\n\ {D+ U D-)) 

< {l-¥\D+ yjD-)) ■ 1 

< (1 _pM)L(i-j)/MJ 

Now let us show that E'*(Xj — pt \ D'^) = 0 by showing that E®(Aj | £>+) = pt- The 
argument for D~ is analogous. 

Let /ii, /i 2 ■ • ■ be the sequence of all finite paths ending in t, containing j occurrences of 
t, and satisfying the condition that the suffix from zth to jth occurrence of t is positively 
determined. The sets hk n partition and moreover P®(/ifc) = P®(/ifc H D~^). Hence, 
we have 


E^(A‘ I D+) = I D+)-E^{X!j I hk) = ^P"(/ife I iZ+) •E«(A‘) 

k k 

= pfY^V^ihk I D+) = Pt 

k 

where the second equality follows because the value of Xj does not depend on the prefix up 
to the jth visit to t, and because hk is a cylindric set. 
A.2 Relating frequency with probability of achieving BSCCs

The following is the final computation for the proof of Proposition Below, we use for 
the random variable that for a run uj returns the number of visits to t on the prefix of co of 
length n. 


n Af; 

liminf i/n = liminf 'S^Xl/n 

n^oo ^^ ’ n^oo ^ ^ ^^ 

i=0 tGbscc(s) i=0 

TV* 

= lim inf /n 

^ n^oc ^ 
tGbscc(s) i—0 


= liminfy]X‘/(n/aTt) 

f ^ T). —^r>o t ^ 


n—^oc 
tGbscc(s) i—0 


= Xiliminfy^ ^i/'^ 

^ n-)-oo ^ 

tGbscc(s) 4=0 


= Y 

t£bscc{s) 


B Details for proof for Markov decision processes

B.1 Proof of Lemma 12


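Lemma 12. For a deterministic strategy $\pi$ in $\mathcal{M}_\otimes$ there is a strategy $\pi'$ in $\mathcal{M}$ that for any
$h = (s_0, C_0) \cdots (a_n, D_n)(s_{n+1}, C_{n+1})$ with $\mathbb{P}^{h}_{\pi}(\{\text{fulfilled state visited infinitely often}\}) = 1$:

- $\mathbb{P}_{\pi'}(s_0 \ldots a_n s_{n+1}) = \mathbb{P}_{\pi}(h)$, and

- for any $(q, ☆) \in D_n$ where $R$ is the automaton of $q$, $\mathbb{P}^{s_0 \ldots a_n s_{n+1}}_{\pi'}(\{\omega \mid R^{q} \text{ accepts } \omega\}) = 1$.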


Proof. For every finite path h = sqUo ■ ■ ■ a„s„+i in AJ there is at most one path of the form 
(sO) Co)(ao, Co) • • • (a„, C„)(s„+i, C„+i)), denoted which satisfies that: 

B with Co fixed above 

H all Di are as chosen with probability 1 by the deterministic strategy cr^ and 
H all Ci all given uniquely by the definition of Al^ 

We define the strategy tt' by 7 r'(h) = n{h^) for all h starting with sq, and define 7 r'(/i) 
arbitrarily otherwise. 

Let h = (so,Co) • • • (a„,C„)(s„+i,C„+i) beapath. Clearly, P^'l(so ■ --anSn+i) = 
by the definition of tt. Also, since tt, when starting after h, almost surely fulfils infinitely 
often, it also never reaches any state with second component containing {q, _L). Hence, it is 
easy to see from the definition of Al® that Pi^g’^°^(/i) > 0, then From the definition of Ai® 
it is easy to see that fulfilling infinitely often implies that for all (g, A) G Dn the automaton 
(where R is the automaton containing q) almost surely accepts suffixes of sooo • ■ ■ Sfc- 
B.2 Proof of Lemma 13


First of all, let us introduce further definitions that we will require later in the proof. For
a run $\omega$ and a state $q$ of a Rabin automaton $R$, by $\mathit{inf}_q(\omega)$ (resp. $\mathit{occ}_q(\omega)$) we denote the
set of states of $R$ that occur infinitely many times in $R[\omega]$ (resp. that occur at least once in
$R[\omega]$) when the Rabin automaton is started from state $q$ instead of $q_{in}$.


Lemma 13. Assume $s^*, q^*, I^*$ are chosen as described above. Then there are sets
$M \subseteq S_\otimes$, $N \subseteq A_\otimes$ where $N$ contains only “accumulating” actions, i.e. actions $(a, C)$ with
$\{(q_{in}, ☆) \mid q_{in} \text{ is the initial state of } R_{\varphi_i,I^*} \text{ for } i \in I^*\} \subseteq C$; and there are finite-memory
deterministic strategies $\pi$ and $\zeta$ such that:

1. When starting in $(s, C) \in M$, $\pi$ only uses actions from $N$ and never leaves $M$.

2. When starting in $(s, C) \in M \cup \{(s^*, \{(q^*, ☆)\})\}$, $\zeta$ almost surely reaches a fulfilled state
(possibly leaving $M$) and then reaches $M$.


We will now prove Lemma 13 As before, fix s* G S, q* G Q, and I* C {1,... n}, and we 
also fix a finite path h* from F witnessing that (s*, g*) G T. We also Gx £ = -I- 2 and 

K = 3 ■ \£\^ • 2” • A where A is the small number introduced at page|^ 

The following definition and Lemma 19 will help us identify (possible) recurring
behaviour of $\sigma$. We need to identify long enough parts of runs where all the frequency formulae
$\varphi_i$ for $i \in I^*$ are satisfied with probability very close to 1. Based on the behaviour of $\sigma$ within these
parts, we later define the “accumulating” strategy.

For a path $h$, let $|h|$ be the length of $h$, i.e. the number of states in $h$. We say that a finite
path $h$ extending $h^*$ is good if

$$\mathbb{E}_\sigma\Big(\sum_{k=0}^{\ell-1} Y_{|h|+k} \;\Big|\; h\Big) > \ell \cdot (1 - \lambda),$$

where $Y_j(\omega)$ is the indicator function that the suffix of $\omega$ starting at the $j$-th
position satisfies all $\varphi_i$ for $i \in I^*$.

► Lemma 17. Let $X$ be a set of runs, $\beta > 0$, and let $J_i = \{h \mid |h| = i \text{ and } \mathbb{P}(X \mid h) > \beta\}$;
then $\lim_{i \to \infty} J_i = X$, i.e. for every $\omega$ there is $i$ such that for all $i' > i$ we have $\omega \in J_{i'}$ iff
$\omega \in X$.


Proof. If $\omega \in X$, then by Lemma 9 there is $i$ such that for all $i' > i$ we have $\mathbb{P}(X \mid h) > \beta$
where $h$ is the prefix of $\omega$ of length $i'$. Then $i$ is the required number. If $\omega \notin X$, then again
by Lemma 9 there is $i$ such that for all $i' > i$ we have $\mathbb{P}(X \mid h) < \beta$ where $h$ is the prefix of
$\omega$ of length $i'$. Then again we pick $i$. ◄


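The following lemma allows us to simplify the notation and only deal with one
frequency-globally formula $\varphi := \bigwedge_{i \in I^*} \varphi_i$.

► Lemma 18. Let $\xi_1, \ldots, \xi_n$ be LTL formulae, and $\omega$ a run. We have $\omega \models \bigwedge_{i=1}^{n} \mathbf{G}^{1}\xi_i$ if and
only if $\omega \models \mathbf{G}^{1} \bigwedge_{i=1}^{n} \xi_i$.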
► Lemma 19. Almost every $\omega$ satisfying $\bigwedge_{i \in I^*} \mathbf{G}^{1}\varphi_i$ has infinitely many good prefixes.

Proof. By contradiction. Employing Lemma 18, we can slightly simplify the problem and
consider runs satisfying $\mathbf{G}^{1}\varphi$ for $\varphi = \bigwedge_{i \in I^*} \varphi_i$. Suppose that there is a set $X'$ with
$\mathbb{P}_\sigma(X') > 0$ such that all $\omega \in X'$ satisfy $\mathbf{G}^{1}\varphi$ and have only finitely many good prefixes.

Further, for a run $\omega \in X'$ let $m_\omega$ denote the smallest number such that for all $m' > m_\omega$,
the prefix $h$ of $\omega$ of length $m'$ is not good. We can pick $m$ and $X \subseteq X'$ satisfying that
$\mathbb{P}_\sigma(X) > 0$, and every $\omega \in X$ satisfies that $m_\omega < m$. Note that such a choice is possible, as
with increasing $m$ the set $X$ tends monotonically to $X'$.

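Note that we have

$$\mathbb{E}_\sigma\Big(\liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} Y_i \;\Big|\; X\Big) = 1 \qquad (1)$$

Furthermore, by Fatou's Lemma, by linearity of expectation, and by taking a subsequence
of averages of chunks of length $\ell$, we have

$$\mathbb{E}_\sigma\Big(\liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} Y_i \;\Big|\; X\Big)
\;\le\; \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \mathbb{E}_\sigma(Y_i \mid X)
\;\le\; \liminf_{k \to \infty} \frac{1}{k} \sum_{j=0}^{k-1} \frac{1}{\ell} \sum_{i=0}^{\ell-1} \mathbb{E}_\sigma(Y_{\ell \cdot j+i} \mid X) \qquad (*)$$

Let $J_i$ be the set defined as in Lemma 17 and denote $\text{co-}J_i = \{h \mid |h| = i\} \setminus J_i$. For
$\ell \cdot j > m$ we have

$$\begin{aligned}
\frac{1}{\ell} \sum_{i=0}^{\ell-1} \mathbb{E}_\sigma(Y_{\ell \cdot j+i} \mid X)
&= \frac{1}{\ell} \sum_{|h| = \ell \cdot j} \mathbb{P}_\sigma(h \mid X) \sum_{i=0}^{\ell-1} \mathbb{E}_\sigma(Y_{\ell \cdot j+i} \mid h \cap X)
  && \text{(law of total expectation)} \\
&= \frac{1}{\ell} \sum_{|h| = \ell \cdot j} \mathbb{P}_\sigma(h \mid X) \sum_{i=0}^{\ell-1} \sum_{x \in \{0,1\}} x \cdot \mathbb{P}_\sigma(Y_{\ell \cdot j+i} = x \mid h \cap X)
  && \text{(def. of expectation)} \\
&= \frac{1}{\ell} \sum_{|h| = \ell \cdot j} \frac{\mathbb{P}_\sigma(h \cap X)}{\mathbb{P}_\sigma(X)} \sum_{i=0}^{\ell-1} \frac{\mathbb{P}_\sigma(\{Y_{\ell \cdot j+i} = 1\} \cap h \cap X)}{\mathbb{P}_\sigma(X \cap h)}
  && \text{(removing 0 terms; def. of cond. probability)} \\
&= \frac{1}{\ell \cdot \mathbb{P}_\sigma(X)} \sum_{|h| = \ell \cdot j} \sum_{i=0}^{\ell-1} \mathbb{P}_\sigma(\{Y_{\ell \cdot j+i} = 1\} \cap h \cap X)
  && \text{($\mathbb{P}_\sigma(h \cap X)$ cancels out)} \\
&= \frac{1}{\ell \cdot \mathbb{P}_\sigma(X)} \Big( \sum_{h \in J_{\ell \cdot j}} \sum_{i=0}^{\ell-1} \big( \mathbb{P}_\sigma(\{Y_{\ell \cdot j+i} = 1\} \cap h) - \mathbb{P}_\sigma(\{Y_{\ell \cdot j+i} = 1\} \cap (h \setminus X)) \big) \\
&\qquad\qquad\quad + \sum_{h \in \text{co-}J_{\ell \cdot j}} \sum_{i=0}^{\ell-1} \mathbb{P}_\sigma(\{Y_{\ell \cdot j+i} = 1\} \cap h \cap X) \Big)
  && \text{(partitioning in paths (not) in $J_{\ell \cdot j}$; set operation)} \\
&\le \frac{1}{\ell \cdot \mathbb{P}_\sigma(X)} \Big( \sum_{h \in J_{\ell \cdot j}} \sum_{i=0}^{\ell-1} \mathbb{P}_\sigma(\{Y_{\ell \cdot j+i} = 1\} \cap h) + \sum_{h \in \text{co-}J_{\ell \cdot j}} \sum_{i=0}^{\ell-1} \mathbb{P}_\sigma(h \cap X) \Big)
  && \text{(removing negative terms)} \\
&= \frac{1}{\ell \cdot \mathbb{P}_\sigma(X)} \Big( \sum_{h \in J_{\ell \cdot j}} \mathbb{P}_\sigma(h) \sum_{i=0}^{\ell-1} \frac{\mathbb{P}_\sigma(\{Y_{\ell \cdot j+i} = 1\} \cap h)}{\mathbb{P}_\sigma(h)} + \ell \cdot \mathbb{P}_\sigma(\text{co-}J_{\ell \cdot j} \cap X) \Big)
  && \text{(multiplying some summands by $\mathbb{P}_\sigma(h)/\mathbb{P}_\sigma(h)$)} \\
&= \frac{1}{\ell \cdot \mathbb{P}_\sigma(X)} \Big( \sum_{h \in J_{\ell \cdot j}} \mathbb{P}_\sigma(h) \sum_{i=0}^{\ell-1} \mathbb{P}_\sigma(Y_{\ell \cdot j+i} = 1 \mid h) + \ell \cdot \mathbb{P}_\sigma(\text{co-}J_{\ell \cdot j} \cap X) \Big)
  && \text{(def. of cond. prob.)} \\
&\le \frac{1}{\ell \cdot \mathbb{P}_\sigma(X)} \Big( \mathbb{P}_\sigma(J_{\ell \cdot j}) \cdot \ell \cdot (1 - \lambda) + \ell \cdot \mathbb{P}_\sigma(\text{co-}J_{\ell \cdot j} \cap X) \Big)
  && \text{(property of $J_{\ell \cdot j}$ for $\ell \cdot j > m$)} \\
&= \frac{\mathbb{P}_\sigma(J_{\ell \cdot j}) \cdot (1 - \lambda)}{\mathbb{P}_\sigma(X)} + \frac{\mathbb{P}_\sigma(\text{co-}J_{\ell \cdot j} \cap X)}{\mathbb{P}_\sigma(X)}
\end{aligned}$$

and hence

$$(*) \;\le\; \liminf_{k \to \infty} \frac{1}{k} \sum_{j=0}^{k-1} \Big( \frac{\mathbb{P}_\sigma(J_{\ell \cdot j}) \cdot (1 - \lambda)}{\mathbb{P}_\sigma(X)} + \frac{\mathbb{P}_\sigma(\text{co-}J_{\ell \cdot j} \cap X)}{\mathbb{P}_\sigma(X)} \Big)$$

and since $\lim_{\ell \cdot j \to \infty} J_{\ell \cdot j} = X$, we get

$$= (1 - \lambda)$$

which is a contradiction with (1). ◄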

By heavily relying on existence of good prefixes, we define labellings of histories of $\mathcal{M}$
that will help us establish a connection to $\mathcal{M}_\otimes$. Namely, the labellings (1) identify what is
the current state in $\mathcal{M}_\otimes$ and (2) resolve the additional choices w.r.t. the second component
of $\mathcal{M}_\otimes$.

We introduce functions $\theta_S$ and $\theta_A$ that label histories starting with $h^*$ with elements
of $2^{\mathcal{C}} \cup \{\bot\}$ and define the current state and the current action to pick in $\mathcal{M}_\otimes$ in the
given history, respectively. Inductively, together with defining the labellings, we also assign
one of two distinct tags to these histories, pseudo-accumulating or pseudo-reaching. We
will then speak about pseudo-reaching and pseudo-accumulating phases which are maximal
consecutive ranges within histories labelled so far such that all prefixes in this range are
tagged as pseudo-reaching or pseudo-accumulating, respectively. A pseudo-reaching phase
is fulfilled if it contains a prefix $h$ in its range such that $\theta_S(h)$ is fulfilled.

Initially, we tag the history h* of M as pseudo-reaching and set ds{h*) = 0 and dA{h*) = 

Suppose that BaW has already been defined and the tag of a history h of M has been 
determined. 

First, for an action $a$ and state $t$, we tag the extension $h \cdot a \cdot t$ of $h$ as pseudo-accumulating
if (i) $h$ is tagged as pseudo-accumulating and the length of the current pseudo-accumulating
phase is less than $\ell$ so far; or (ii) $h$ is in a pseudo-reaching phase such that some prefix of $h$
within that phase is fulfilled and $h$ is good. Otherwise we tag $h \cdot a \cdot t$ as pseudo-reaching.

Next, we define $\theta_S(h \cdot a \cdot t)$ by “deterministically evolving” by reading the last state of
$h$ as in the definition of $\mathcal{M}_\otimes$, i.e. $\theta_S(h \cdot a \cdot t)$ is the minimal set such that for
any $(q, x) \in \theta_A(h)$ there is $(q', x') \in \theta_S(h \cdot a \cdot t)$ with $q \xrightarrow{s} q'$ for the last state $s$ of $h$
and $x \leadsto x'$, where the latter relation is the relation from the definition of $\mathcal{M}_\otimes$.

We define $\theta_A(h \cdot a \cdot t)$ to be the minimum element (w.r.t. set inclusion) of $2^{\mathcal{C}}$ satisfying
the following:

- If $h \cdot a \cdot t$ is in a pseudo-accumulating phase, then $\theta_A(h \cdot a \cdot t)$ contains $(q_{in}, ☆)$ for the
  initial states $q_{in}$ of $R_{\varphi_i,I^*}$ for all $i \in I^*$.

- For all $(q, ☆) \in \theta_S(h \cdot a \cdot t)$ such that for some $(E, F)$,

  $$\mathbb{P}^{h \cdot a \cdot t}_\sigma(\{\omega \mid \mathit{occ}_q(\omega) \cap E = \emptyset,\ \mathit{inf}_q(\omega) \cap F \neq \emptyset\}) > 1 - \lambda$$

  we put $(q, (E, F)_\circ) \in \theta_A(h \cdot a \cdot t)$, and otherwise we put $(q, ☆) \in \theta_A(h \cdot a \cdot t)$. In the case
  there are several $(E, F)$ satisfying the condition above, we pick the least one w.r.t. an
  arbitrary but a priori fixed total order.

- For all $(q, (E, F)_\ast) \in \theta_S(h \cdot a \cdot t)$ we put $(q, (E, F)_\ast) \in \theta_A(h \cdot a \cdot t)$.

Note that the minimum element satisfying these conditions always exists. Also note that
these definitions are analogous to those in $\mathcal{M}_\otimes$, but in addition we give a rule for
“committing” to an accepting condition.

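Finally, any (finite or infinite) path $\omega = s_0 a_0 s_1 a_1 \ldots$ in $\mathcal{M}$ initiated in $h^*$ corresponds
to a path

$$\omega_\otimes = (s_0, \theta_S(s_0))(a_0, \theta_A(s_0))(s_1, \theta_S(s_0 a_0 s_1))(a_1, \theta_A(s_0 a_0 s_1)) \ldots$$

in $\mathcal{M}_\otimes$. Similarly, the strategy $\sigma$ gives rise to a strategy $\sigma_\otimes$ defined, for all $h$, by
$\sigma_\otimes(h_\otimes)(a, \theta_A(h)) = \sigma(h)(a)$. The connection between the labellings and the MDP $\mathcal{M}_\otimes$ is
completed by the following lemma that can be proven immediately from the definitions.

► Lemma 20. For any set $T$, $\mathbb{P}^{h^*}_\sigma(T) = \mathbb{P}_{\sigma_\otimes}(\{\omega_\otimes \mid \omega \in T\})$.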

Note that the strategy $\sigma_\otimes$ in Lemma 20 is possibly still very complex in its structure
and in particular can reach states with $(q, \bot)$ in the second component. We however show
that within a certain finite horizon this happens with a small probability.

Let $\mathit{depth}(h)$ be the number of pseudo-accumulating phases along the path $h$. Let $T$ be
the set of runs $\omega$ that have $\mathit{depth}(\omega) > \ell$, and for which no prefix $h$ with $\mathit{depth}(h) < \ell$ has
$\theta_A(h) = \bot$. We will show below that the probability of runs in $T$ is very large.

► Lemma 21. Pcr(T | h*) > 1 — 3 • • A = 1 — k. 

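Proof. First, we start with the set of runs

$$U = \{\omega \mid \omega \models \psi \wedge \bigwedge_{i \in I^*} \mathbf{G}^{1}\varphi_i\} \cap h^*$$

with $\mathbb{P}_\sigma(\Omega \setminus U \mid h^*) < \lambda$ as given by the assumption of Lemma 13 (here $\Omega$ denotes the set
of all runs).

Furthermore, let $V \subseteq U$ be the set of runs where all the “accumulated” Rabin automata
accept, i.e. runs $\omega$ such that for all $i \in I^*$ and for all prefixes $h_0$ in an at most $\ell$-th
pseudo-accumulating phase, we have that $R_{\varphi_i,I^*}$ accepts $\omega'$ where $\omega = h_0 \cdot \omega'$. For a fixed
accumulating phase which starts at some good history $h$, we have (denoting $\sum_{k=0}^{\ell-1} Y_{|h|+k}$ by
$Y_{ph}$ where $|h|$ is the number of states in $h$)

$$\ell \cdot (1 - \lambda) < \mathbb{E}_\sigma(Y_{ph} \mid h) \le \ell \cdot \mathbb{P}_\sigma(Y_{ph} = \ell \mid h) + (\ell - 1) \cdot \mathbb{P}_\sigma(Y_{ph} < \ell \mid h),$$

yielding $\mathbb{P}_\sigma(Y_{ph} = \ell \mid h) > 1 - \ell\lambda$. Thus for $W_{ph}$ denoting the number of Rabin automata
accepting in all $\ell$ accumulating phases, we easily obtain $\mathbb{P}_\sigma(W_{ph} = \ell \cdot \ell \mid h^*) > 1 - \ell^2 \cdot \lambda$ and
thus $\mathbb{P}_\sigma(U \setminus V) \le \ell^2 \cdot \lambda$.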
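The step from the expectation bound to the probability bound can be spelled out. The following short derivation is a sketch that abbreviates $p = \mathbb{P}_\sigma(Y_{ph} = \ell \mid h)$; the abbreviation is introduced only here.

```latex
\[
  \ell \cdot (1 - \lambda)
  \;<\; \ell \cdot p + (\ell - 1) \cdot (1 - p)
  \;=\; \ell - 1 + p
  \quad\Longrightarrow\quad
  p \;>\; 1 - \ell \lambda .
\]
% Union bound over the \ell accumulating phases, each failing with probability below \ell\lambda:
\[
  \mathbb{P}_\sigma(\text{some of the } \ell \text{ phases fails} \mid h^*)
  \;<\; \ell \cdot \ell \lambda \;=\; \ell^2 \cdot \lambda .
\]
```

The second line is what gives the bound on $W_{ph}$: every one of the $\ell$ phases contributes $\ell$ accepting automata unless it fails, and a failure in any single phase has probability below $\ell\lambda$.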

For every f G /, we say that starting after h, the history h' decides for an acceptance 
condition {E,F) of Rip-j* if 
H /i is in pseudo-accumulating phase. 
- $h'$ is the shortest history such that for some $(E', F')$

({w I occq(uj) n E = 0, infq(uj) n E ^ 0}) > 1 — A 

where q is the state in which ends after reading h', and 

H {E,F) is the minimal one among such acceptance conditions {E\F') (w.r.t. the above 

fixed order). 

We define a set W C V of runs where this “decision” turns out to be correct for all 
automata started in the first £ accumulating phases. Technically, uj G W if for every i G I* 
and every splitting w = h-h' ■ui 2 such that h is in an at most .^-th pseudo-accumulating phase 
we have the following. If starting after h, h' decides for some {E, F), we have 0 CCq{uj 2 )<AE = 0 
and infg(a; 2 ) D F ^ 0 where again q is the state in which ends after reading h'. 

When starting after a single h, h' decides for some {E, F), the probability of not sticking 
to this decision is by definition at most A (conditioned by h-h'). Similarly as before, there are 
at most x |/| decisions to take, yielding the overall probability at most Pct(F\W) < \I\-X-£‘^ 
of runs that do not stick to decisions up to i. 

For almost every run u: gV we have that u) G T if uj G W. Indeed, inductively, for all 
prefixes hat of u such that h is in an at most f'-th pseudo-accumulating or pseudo-reaching 
phase and ^ -L, we have 9A{hat) yf _L because no forbidden state g of a previously 

decided automaton is visited along any uj of W. Furthermore, every label {q,{E,F)o) is 
eventually replaced by {q, {E,F),) because uj G V' and every (g, tV) is eventually replaced 
by some {q, {E,F)o) (for almost every lo GV) due to Lemma given below. Thus, the set 
of labels along w becomes at least £ times fulfilled. 

Summing up P^(n \ [/), P^(C/ \ V) and Pj(tA \ VF), we obtain the statement of the 
lemma. ◄ 

An important step in the previous proof was that on almost every accepting path there 
is a prefix where the Rabin automaton “decides” for one accepting condition with high 
probability. The proof is again based on Levy’s Zero-One Law. 

► Lemma 22. Let R be a Rabin automaton, h be a path, V = {h ■ u \ R accepts uj}, and 
P^(R) > 0. For almost allh-uj' gV there is a prefix h' ofuj' and an acceptance pair (E,F) 
of R such that 

{{uj I occq(w) r\E = 0, inf q{uj) n F ^ 0}) > 1 - X 
where q is the state in which R ends after reading h'. 

Proof. Let the acceptance conditions of R be (i?i, Fi)i<i<„ and its initial state be go- For 
each i, let Ri be the set 

{h- uj' gV \ infqj ,( w ') r\ Ei = 0,inf qg{u}') n Ft ^ 0} 

and Ifi- be its indicator function. As for each h ■ uj' in some Ri, (ft- • uj') = 1, we also 
have from Lemma (Levy’s zero one law) that limfc_,,oo PCT(i?i | h}f) = 1 where ft^ are the 
prefixes of h ■ uj' of length k. Hence, there is k such that all prefixes hk> for k' > k satisfy: 

P,,(i?, I ftfc) > 1-(2) 

Let us fix an arbitrary partition of V into disjoint sets R'l, ..., R'^ such that for all 
1 < i < n, Ft!^ G_ Ri. For each i and run h ■ uj' = sgag ■ ■ ■ let 

lastSini{uj') = sup({0} U {n | s„ G Ri\)- 
Let $h \cdot \omega' \in R'_i$. As we have $\mathbf{1}_{\{\mathit{lastSin}_i > \mathit{lastSin}_i(h \cdot \omega')\}}(h \cdot \omega') = 0$, we also have from
Lemma 9 that $\lim_{k \to \infty} \mathbb{P}_\sigma(\{\mathit{lastSin}_i > \mathit{lastSin}_i(h \cdot \omega')\} \mid h_k) = 0$. Hence, there is $k \in \mathbb{N}$
such that $k > \mathit{lastSin}_i(h \cdot \omega')$ and all prefixes $h_{k'}$ of length $k' > k$ satisfy

Fcr{{lastSini > k} \ hk) < (3) 

In total, we obtain fromandthe desired statement. ◄ 

Before constructing the accumulating and reaching strategies, we state the following 
lemmas that we will need. 

The first lemma says that if we can achieve a certain event with a large enough probability 
in an MDP, then we can achieve it with probability 1. The proof follows from the fact that 
there are optimal deterministic strategies with memory of size 2. 

► Lemma 23. Let M be an MDP with state space S, where each state s is labelled with an 
atomic proposition s unique to this state, let p be the minimal probability occurring in it, 
and let Gi and G 2 be two sets of states. The following statements hold true: 

1 . If supg. Pct(F (Gi a F G 2 )) > 1 — then Po-(F (Gi A F G 2 )) = 1 for some a. 

2. If supj,. Pcr({w = soflosiai ■ • ■ | Vi < IS”! : G Gi}) > 1 — then Pct(G Gi)) = 1 for 
some a. 

Proof. Let us analyse the second case which is slightly more technical. The set {w = 
sooosiai ... I Vi < [S'! : Si G Gi} can be captured using an LTL property and so the 
supremum is realised by some deterministic strategy ah Suppose it is lower than 1. Then, 
since a' is deterministic, there must be a history sooosioi ■ ■ ■ Si for i < jS”! such that Si ^ Gi 
and Pcr'(soaoSiai ... Si) > p*, which is a contradiction. 

The first case can be proved similarly, we only need to consider that deterministic 
strategies with memory of size 2 are sufficient to achieve the supremum. ◄ 

From now on, we will consider the strategy $\sigma^{\otimes}$ in $M^{\otimes}$ instead of $\sigma$. We transfer the
labelling with a pseudo-reaching and pseudo-accumulating phase to runs of $M^{\otimes}$ in the
straightforward way.

Let $W_i$ be the set of histories that are in the $i$-th pseudo-accumulating phase and whose
predecessors are not in the $i$-th pseudo-accumulating phase. In order to define accumulating
and reaching strategies, we need to select subsets of these histories that are “connected”
with high probability. We thus select non-empty sets Wi C Wi which in addition satisfy 

P<T®(^i) > 1 ~ for all l<i<(., and 

Po'®(^i+i I ^) > 1 ~ for all h£Wi and l<i<l 

(recall that we interchangeably interpret a set of histories also as a set of runs starting with 
some history from the set). This is indeed possible, it suffices to put Wt = Wi that satisfies 
the first condition by Lemma Supposing Wi+\ has been defined, we get Wi by the 
following. 

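► Lemma 24. Let $\varepsilon > 0$ and $\mathbb{P}$ be a probability measure. Further, let $U$ be a set of runs such
that $\mathbb{P}(U) \ge 1 - \varepsilon$, and let $H$ be a prefix-free set of finite paths such that $\mathbb{P}(H) \ge 1 - \varepsilon$.
There is a set $V \subseteq H$ with $\mathbb{P}(V) \ge 1 - 2\sqrt{\varepsilon}$ and $\mathbb{P}(U \mid w) \ge 1 - \sqrt{\varepsilon}$ for all $w \in V$.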

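Proof. We can assume $\varepsilon < 1/4$; otherwise the statement holds for any set $V$, as $2\sqrt{\varepsilon} \ge 1$.
We set $V := \{y \in H \mid \mathbb{P}(U \mid y) \ge 1 - \sqrt{\varepsilon}\}$. We claim that $\mathbb{P}(V) > 0$. Let us assume the
contrary. This means that all $y \in H$ with $\mathbb{P}(y) > 0$ satisfy $\mathbb{P}(U \mid y) < 1 - \sqrt{\varepsilon} < 1 - 2\varepsilon \le
\mathbb{P}(U) - \varepsilon$ and hence also $\mathbb{P}(U \mid H) < \mathbb{P}(U) - \varepsilon$. We have

$$\mathbb{P}(U) \le \mathbb{P}(H) \cdot \mathbb{P}(U \mid H) + \mathbb{P}(\bar{H}) \cdot 1 < (1 - \varepsilon) \cdot (\mathbb{P}(U) - \varepsilon) + \varepsilon = \mathbb{P}(U) - \varepsilon \cdot (\mathbb{P}(U) - \varepsilon) < \mathbb{P}(U),$$

where the last step uses $\mathbb{P}(U) \ge 1 - \varepsilon > \varepsilon$, yielding a contradiction which proves that $\mathbb{P}(V) > 0$.

As all runs can be partitioned into sets $\bar{H}$, $V$, and $\bar{V} := H \setminus V$, we have

$$1 - \varepsilon \le \mathbb{P}(U) \le \mathbb{P}(\bar{H}) \cdot 1 + \mathbb{P}(V) \cdot 1 + \mathbb{P}(\bar{V}) \cdot \mathbb{P}(U \mid \bar{V})$$

where 1 overapproximates the (potentially undefined) probabilities $\mathbb{P}(U \mid \bar{H})$ and $\mathbb{P}(U \mid V)$.
Since $\mathbb{P}(\bar{H}) \le \varepsilon$ and $\mathbb{P}(U \mid \bar{V}) < 1 - \sqrt{\varepsilon}$, we obtain

$$1 - 2\varepsilon \le \mathbb{P}(V) + \mathbb{P}(\bar{V}) \cdot (1 - \sqrt{\varepsilon})$$

$$1 - 2\varepsilon \le \mathbb{P}(V) + (1 - \mathbb{P}(V)) \cdot (1 - \sqrt{\varepsilon})$$

$$\sqrt{\varepsilon} - 2\varepsilon \le \mathbb{P}(V) \cdot \sqrt{\varepsilon}$$

and so $\mathbb{P}(V) \ge 1 - 2\sqrt{\varepsilon}$. ◄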
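The selection of $V$ in Lemma 24 can be checked on a small finite example. The following is a minimal sketch, not part of the original proof: the function names `select_good_prefixes` and `mass`, the dictionary encoding of $H$, and the numbers used are all illustrative assumptions. Each prefix $y \in H$ is given by its probability $\mathbb{P}(y)$ and the conditional probability $\mathbb{P}(U \mid y)$.

```python
import math
from fractions import Fraction


def select_good_prefixes(prefix_prob, cond_prob, eps):
    """Return V = {y in H | P(U | y) >= 1 - sqrt(eps)} as in the proof of Lemma 24.

    prefix_prob maps each prefix y in H to P(y); cond_prob maps y to P(U | y).
    Both encodings are assumptions made for this illustration.
    """
    threshold = 1 - math.sqrt(eps)
    return {y for y in prefix_prob
            if prefix_prob[y] > 0 and cond_prob[y] >= threshold}


def mass(prefix_prob, subset):
    """Total probability of a set of pairwise incomparable prefixes."""
    return sum((prefix_prob[y] for y in subset), Fraction(0))


# Hypothetical data: P(H) = 1 and P(U) = 24/25, so both premises hold for eps = 1/25.
eps = Fraction(1, 25)
prefix_prob = {"a": Fraction(1, 2), "b": Fraction(9, 20), "c": Fraction(1, 20)}
cond_prob = {"a": Fraction(1), "b": Fraction(1), "c": Fraction(1, 5)}

V = select_good_prefixes(prefix_prob, cond_prob, eps)
# The lemma promises P(V) >= 1 - 2 * sqrt(eps).
assert mass(prefix_prob, V) >= 1 - 2 * math.sqrt(eps)
```

In this example the prefix `"c"` is discarded because $\mathbb{P}(U \mid c) = 1/5$ lies below the threshold $1 - \sqrt{\varepsilon} = 4/5$, and the remaining mass $19/20$ is well above the guaranteed $1 - 2\sqrt{\varepsilon} = 3/5$.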

In fact, it suffices to set U = Ify+i and H = Wi and obtain Wi as V from the lemma. The 
probability of all such Wi is > 1 — k. It is easy to prove by induction that the probability 
of each Ify is 1 — 2 “* • ’ where Oi = ~ *) obtaining the first inequality as 

limi_>.oo Oi = 4. The second inequality is guaranteed by the properties of V by Lemma pdj 
Even the sets of Wi are still not enough for our proof, we would like to get sets of histories 
that are “connected” with high probability from anywhere within the accumulating phase. 
For all i and all h G Wi we apply the following lemma and obtain a prefix-free set of paths 
Zh such that P^-^ {Zh) > 1 - 4 • ■ and for all prefixes h' of any path in Zh we have 

P„^(W,+i I h') 

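► Lemma 25. Let $W$ be a set of runs such that

$$\mathbb{P}_{\sigma^{\otimes}}(W) \ge 1 - \varepsilon;$$

then there is a prefix-free set $V$ of finite paths of length $\ell$ such that $\mathbb{P}_{\sigma^{\otimes}}(V) \ge 1 - 2 \cdot \ell \cdot \sqrt{\varepsilon}$
and for all prefixes $h'$ of a path in $V$ we have

$$\mathbb{P}_{\sigma^{\otimes}}(W \mid h') \ge 1 - \sqrt{\varepsilon}.$$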


Proof. For all $k$, we can find a set $V_k$ of paths of length $k$ such that
$\mathbb{P}_{\sigma^{\otimes}}(V_k) \ge 1 - 2\sqrt{\varepsilon}$ and $\mathbb{P}_{\sigma^{\otimes}}(W \mid h') \ge 1 - \sqrt{\varepsilon}$ for all $h' \in V_k$; this is possible by Lemma 24,
where for $H$ we take all paths of length $k$. The set $V$ is then obtained as $V = \bigcap_{k=1}^{\ell} V_k$ (note that this is indeed a set
of paths). ◄
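The probability bound for the intersection follows from a union bound that the proof leaves implicit. A sketch of that step, assuming (as the appeal to Lemma 24 suggests) that each set $V_k$ of paths of length $k$ has probability at least $1 - 2\sqrt{\varepsilon}$, and writing $\mathbb{P}$ for the measure of the lemma and $\overline{V_k}$ for the complement of $V_k$:

```latex
\mathbb{P}(V)
  = \mathbb{P}\Bigl(\bigcap_{k=1}^{\ell} V_k\Bigr)
  = 1 - \mathbb{P}\Bigl(\bigcup_{k=1}^{\ell} \overline{V_k}\Bigr)
  \ge 1 - \sum_{k=1}^{\ell} \bigl(1 - \mathbb{P}(V_k)\bigr)
  \ge 1 - 2 \cdot \ell \cdot \sqrt{\varepsilon}.
```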


Finally, we are ready to obtain the accumulating and reaching strategies. Below, last(h) 
is the last state of a path h. 


► Lemma 26. For $i < \ell$, $h \in W_i$ and $h' \in Z_h$, there is a strategy $\zeta_{h'}$ that from $\mathit{last}(h')$
almost surely reaches $\{\mathit{last}(h'') \mid h'' \in W_{i+1}\}$ after passing through a fulfilled state.


Proof. Such strategy always exists because of Lemma 
1 — 2 • by properties of elements of Wi and
and because F^^iWij^i


w) > 

◄

The following lemma can be easily obtained from Lemma 23.

► Lemma 27. For all $w \in W_i$, there is a memoryless deterministic strategy $\pi_w$ (in $M^{\otimes}$)
which, when started in $\mathit{last}(w)$, only ever reaches states and uses actions that occur on some
history of . 

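In addition, denote by $\pi_{s,C}$ a strategy $\pi_w$ for $w$ belonging to $W_i$ for $i = \min\{j \mid \exists w' \in
W_j,\ \mathit{last}(w') = (s, C)\}$. By $(M_{s,C}, N_{s,C})$ we denote the tuple of sets of states and actions
that $\pi_{s,C}$ visits when started in $(s, C)$.

Let $\zeta_{s,C}$ be a strategy $\zeta_w$ where $w \in \bigcup_{w' \in W_i} Z_{w'}$, $i = \min\{j \mid \exists w' \in \bigcup_{w'' \in W_j} Z_{w''},\
\mathit{last}(w') = (s, C)\}$. Let $\mathit{rank}((s, C)) = i$.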

We inductively build $(M, N)$ as follows. Initially, $M = \{\mathit{last}(w) \mid w \in W_1\}$ and $N = \emptyset$.
We then keep adding to $M$ and $N$, until a fixpoint is reached, (i) the states and actions
of $(M_{s,C}, N_{s,C})$ for all $(s, C) \in M$, and (ii) last states of histories $W_{i+1}$ for $i$ such that
there is $(s, C) \in M$ with $\mathit{rank}((s, C)) = i$. We claim that this procedure is well-defined
in the sense that the sets $W_{i+1}$ in step (ii) above were always defined, i.e. that $i < \ell$ in
every case. For this, we need to show that whenever $\pi_{(s,C)}$ is taken in the definition, then
$\mathit{rank}((s, C)) \le \ell - 1$. Letting $\mathit{rank}(M) = \max\{\mathit{rank}((s, C)) \mid (s, C) \in M\}$, we can
argue that initially $\mathit{rank}(M) = 1$ and with every iteration of steps (i) and (ii) the rank
increases at most by 1. Since only $|S^{\otimes}|$ elements can be added to $M$ before a fixpoint is
reached, we get that the bound on $\mathit{rank}(M)$ is $|S^{\otimes}|$.

Now we claim that the Lemma flSl is satisfied. 

H As for the property of N, note that we were only adding states to N if they were last 
states of a history in a pseudo-accumulating phase, and by definition of 6a we have 
(^in, A) in the second component of such states for the initial states qin of the automata 
for all i G I. 

- The strategy $\pi$ is defined as follows. Let $h$ be a history starting in $(s, C) \in M$;
we put $\pi(h) = \pi_{(s',C')}(h)$ where $(s', C')$ is the element such that $(s, C) \in M_{s',C'}$. For
any other history we define $\pi$ arbitrarily.

- The strategy $\zeta$ is defined as follows. Let $h$ be a history starting in $(s, C) \in M$;
we put $\zeta(h) = \zeta_{(s,C)}(h)$. For any other history we define $\zeta$ arbitrarily.

B.3 Details for proof of the theorem below and Section 4.2

Theorem. The controller-synthesis problem for 1-fLTL for MDPs is solvable in time
polynomial in the size of the model and doubly exponential in the size of the formula.

Proof. We now give a more detailed description of the algorithm that is presented in
Section 4.2.

1. Construct the automata for all $\xi \in \{\varphi, \varphi_1, \ldots, \varphi_n\}$ and $I \subseteq \{1, \ldots, n\}$.

2. Initialize $T := \emptyset$.

3. Repeat the following for every $I$. Find the largest sets $(M, N)$ satisfying the conditions
of the lemma. It can be done as follows:

- Let $\mathit{trunc}(M, N)$ denote the tuple $(M', N')$ that contains maximal subsets of $M$ and
$N$ satisfying that for every $s \in M'$ there is $a \in N'$ such that $\Delta(s, a)$ is defined and
for every $s'$ contained in the support of $\Delta(s, a)$ we have $s' \in M'$. (Easily obtained by
iteratively pruning actions and states violating the conditions.)





- We start with $M = S^{\otimes}$ and $N$ containing all “accumulating” actions $(a, C)$ whose
component $C$ covers, for every $i \in I$, the initial state $q_{in}$ of the automaton for $\varphi_i$. Then we apply the following
steps until a fixpoint is reached:

(a) $(M, N) := \mathit{trunc}(M, N)$;

(b) Remove from $M$ all states that do not satisfy the corresponding items of the lemma. (Easily
achieved by qualitative safety and reachability analysis in $M^{\otimes}$.)

This yields a set $(M, N)$, and we add to $T$ all pairs $(s, q)$ such that $(s, \{(q, *)\}) \in M$.

4. Compute an optimal strategy $\sigma'$ for “reaching” $T$ (as defined in the proposition) and return
the probability that $\sigma'$ “reaches” $T$. It can be done as follows:

- By $M_0$ we denote the “naive” product of $M$ with all the main Rabin automata
for all $I \subseteq \{1, \ldots, n\}$. Formally, fixing $I_0, \ldots, I_m$ an enumeration of subsets of
$\{1, \ldots, n\}$, the state space $S_0$ of $M_0$ contains tuples $(s, q^{I_0}, \ldots, q^{I_m})$ where $q^{I_i}$ is a
state of the main automaton for $I_i$, the set of actions is $A_0 = A$, and the transition function $\Delta_0$ is given
by

$$\Delta_0((s, q^{I_0}, \ldots, q^{I_m}), a)(t, q'^{I_0}, \ldots, q'^{I_m}) = \Delta(s, a)(t)$$

when for every $0 \le i \le m$, we have $q^{I_i} \to q'^{I_i}$.

- Furthermore, let $T_0 \subseteq S_0$ be the set of all $(s, q^{I_0}, \ldots, q^{I_m})$ such that there is $i$ with $(s, q^{I_i}) \in T$.

- By the construction, we easily obtain equivalence of strategies of the following form.
For any $\sigma$ in $M$ there is $\sigma_0$ in $M_0$ and also for any $\sigma_0$ there is $\sigma$ such that

$$\mathbb{P}_{\sigma}(\{\omega \mid \omega \text{ reaches a pair from } T\}) = \mathbb{P}_{\sigma_0}(\{\omega \mid \omega \text{ reaches some state from } T_0\}).$$

Let us prove the statement. For any finite or infinite path $\omega = s_0 a_0 s_1 a_1 \ldots$ in $M$
initiated in $s_{in}$ there is a unique path $\omega_0 = (s'_0, q_0^{I_0}, \ldots, q_0^{I_m})\, a'_0\, (s'_1, q_1^{I_0}, \ldots, q_1^{I_m})\, a'_1 \ldots$ in $M_0$
with $s_i = s'_i$ and $a_i = a'_i$ for all $i$. For a fixed $\sigma_0$ define $\sigma$ by $\sigma(h) = \sigma_0(h_0)$ for
all $h$. Similarly, for a fixed $\sigma$, we define $\sigma_0$ by $\sigma_0(h_0) = \sigma(h)$ for all $h$. The equality
easily follows from the definitions.

- The above statement allows us to compute an optimal strategy $\sigma_0$ in $M_0$ using
ordinary reachability algorithms and set $\sigma'$ to the corresponding strategy in $M$.
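The pruning operator trunc used in step 3 can be sketched as a simple fixpoint loop. This is an illustrative sketch only, under assumptions that are not taken from the paper: actions are encoded as (state, action) pairs, the transition function is a dictionary `delta` mapping such pairs to successor distributions, and a pair is kept only while its source state and its whole support remain in the state set.

```python
def trunc(states, actions, delta):
    """Return maximal (M', N') closed under the kept actions.

    states  -- candidate set M
    actions -- candidate set N of (state, action) pairs (an assumed encoding)
    delta   -- dict: (state, action) -> {successor: probability}
    Every kept state has a kept action, and every kept action stays inside M'.
    """
    m = set(states)
    n = {sa for sa in actions if sa in delta}  # keep only defined transitions
    changed = True
    while changed:
        changed = False
        # Prune actions whose source was removed or whose support leaves m.
        kept_actions = {
            (s, a) for (s, a) in n
            if s in m and all(t in m for t, p in delta[(s, a)].items() if p > 0)
        }
        if kept_actions != n:
            n, changed = kept_actions, True
        # Prune states that have no remaining action.
        kept_states = m & {s for (s, _a) in n}
        if kept_states != m:
            m, changed = kept_states, True
    return m, n


# Hypothetical four-state example: s3 is outside the candidate set.
delta = {
    ("s0", "a"): {"s0": 0.5, "s1": 0.5},
    ("s0", "d"): {"s2": 1.0},
    ("s1", "b"): {"s1": 1.0},
    ("s2", "c"): {"s3": 1.0},
}
M, N = trunc({"s0", "s1", "s2"}, set(delta), delta)
```

Here the action of `s2` leaves the candidate set, so `s2` is removed, which in turn removes the action `d` of `s0`. Each round removes at least one state or action, so the loop runs at most as many rounds as there are states and actions, consistent with the polynomial bound claimed for trunc.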

Let us now analyse the complexity of the algorithm in more detail. Each of the Rabin
automata in step 1 above can be computed in time doubly exponential in $|\varphi|$, and since there are exponentially
many such automata (in $|\varphi|$), step 1 takes time doubly exponential in $|\varphi|$. In step 3, for a fixed $I$, $M$ and
$N$, the result of $\mathit{trunc}(M, N)$ can be computed in polynomial time in the size of $M$ and
$N$; the same holds for satisfaction of the conditions in (b). The size of $M \subseteq S^{\otimes}$ and
$N \subseteq A^{\otimes}$ is polynomial in $|S|$ and doubly exponential in $|\varphi|$, and for a fixed $I$ the number of
iterations until the fixpoint is reached is bounded in the same way. Moreover, there are at most $2^{|\varphi|}$ different sets $I$. Hence, step 3 can be performed
in time polynomial in $|S|$ and doubly exponential in $|\varphi|$. Finally, in step 4 we are computing a reachability probability in
an MDP $M_0$ whose size is polynomial in $|S|$ and doubly exponential in $|\varphi|$, and so this step can also be done in time
polynomial in $|S|$ and doubly exponential in $|\varphi|$. This completes the proof of the theorem. ◄
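Step 4 of the algorithm reduces to an ordinary maximal reachability computation in the product MDP. The proof does not fix a particular algorithm; the sketch below uses value iteration as one standard choice, which approximates the optimal value from below (linear programming would be the usual route to the polynomial-time bound). The function name, the dictionary encoding of the transition function, and the example MDP are illustrative assumptions.

```python
def max_reach_probability(states, delta, targets, iterations=1000):
    """Approximate sup over strategies of P(reach targets) by value iteration.

    states  -- iterable of states
    delta   -- dict: (state, action) -> {successor: probability} (assumed encoding)
    targets -- set of target states
    Returns a dict mapping each state to its approximate optimal value.
    """
    states = list(states)
    value = {s: (1.0 if s in targets else 0.0) for s in states}
    for _ in range(iterations):
        new_value = {}
        for s in states:
            if s in targets:
                new_value[s] = 1.0
                continue
            # Expected value of each action enabled in s; states without
            # actions are treated as dead ends with value 0.
            options = [
                sum(p * value[t] for t, p in dist.items())
                for (src, _a), dist in delta.items() if src == s
            ]
            new_value[s] = max(options, default=0.0)
        value = new_value
    return value


# Hypothetical example: action "b" retries until the goal is hit.
example_delta = {
    ("s0", "a"): {"goal": 0.5, "sink": 0.5},
    ("s0", "b"): {"s0": 0.5, "goal": 0.5},
}
values = max_reach_probability({"s0", "goal", "sink"}, example_delta, {"goal"})
```

In the example, always choosing `b` in `s0` reaches `goal` almost surely, so the value of `s0` converges to 1, while the one-shot action `a` would only give 1/2.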


